[Arm-dev] Fwd: Kernel problems on APM X-Gene

Fri Sep 22 07:14:03 UTC 2017
Gordan Bobic <gordan at redsleeve.org>

This sounds familiar. ISTR messages like this when I tried to use any PCIe
cards in my machine, and the machine would typically lock up as soon as the
driver for the PCIe card was loaded.

I eventually gave up on running with any PCIe cards.

What was interesting was that the CentOS kernel package would run fine, but
as soon as I rebuilt a mainline kernel with the same config and 4KB memory
pages, the machine would become allergic to any PCIe card I put in it
(I tried several AMD and Nvidia GPUs and several SATA controllers).

Do you have any PCIe cards in yours?


On 22 Sep 2017 01:35, "Jeremiah Rothschild" <jeremiah at franz.com> wrote:

On Thu, Sep 21, 2017 at 03:14:41AM -0700, Jeremiah Rothschild wrote:
> Hmm, interesting. Perhaps if I can figure out how to get the kernel booting
> to actually display via video (or IPMI) then it would reveal an error
> message. It has never shown output after GRUB but I imagine there's a way to
> route it.

I was able to get output by adding this to my kernel parameters:

console=ttyS0,115200 earlycon=uart8250,mmio32,0x1c020000

From there, booting into 4.11, it hung here:

[    4.168868] acpi ACPI0007:07: CPPC data invalid or not present
[    4.175198] GHES: HEST is not enabled!
[    4.179167] ACPI GTDT: [Firmware Bug]: failed to get the Watchdog base address.
[    4.187021] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled

From there, I added 'acpi=off' to the kernel parameters. Now I can
successfully boot the 4.11 kernel.
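
For anyone who wants to confirm that these parameters actually reached the kernel, here is a minimal Python sketch. The helper names `parse_cmdline` and `parse_earlycon` are hypothetical, and the parser assumes simple whitespace-separated tokens with no quoted values. On a running system the string would come from /proc/cmdline; the sample string below just combines the parameters mentioned above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None.

    Assumption: tokens are whitespace-separated and contain no quoted spaces.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def parse_earlycon(value):
    """Split 'uart8250,mmio32,0x1c020000' into driver, I/O type, base address."""
    driver, iotype, addr = value.split(",")[:3]
    return driver, iotype, int(addr, 16)


if __name__ == "__main__":
    # On a live machine, replace the sample with:
    #   cmdline = open("/proc/cmdline").read()
    cmdline = "console=ttyS0,115200 earlycon=uart8250,mmio32,0x1c020000 acpi=off"
    params = parse_cmdline(cmdline)
    print("console :", params.get("console"))
    print("earlycon:", parse_earlycon(params["earlycon"]))
    print("acpi    :", params.get("acpi"))
```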

However, the console is flooded with messages like this:

[   42.982471] pcieport 0002:00:00.0:   TLP Header: 60001004 000000ff 00000090 30032000
[   42.990201] pcieport 0002:00:00.0: AER: Device recovery failed
[   43.156938] pcieport 0002:00:00.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0000
[   43.165695] pcieport 0002:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[   43.177385] pcieport 0002:00:00.0:   device [10e8:e004] error status/mask=00100000/00000000
[   43.185695] pcieport 0002:00:00.0:    [20] Unsupported Request    (First)
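
As an aside on reading the "error status/mask" line: the status value is a bitmask, and the "[20]" in the last line is the index of the bit that is set. The following is a rough Python sketch of that decoding, assuming the bit layout of the AER Uncorrectable Error Status register from the PCIe specification. The function name `decode_uncorrectable` is hypothetical, and the table lists only the commonly documented bits.

```python
# Bit positions in the AER Uncorrectable Error Status register
# (per the PCIe specification; not an exhaustive list).
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
}


def decode_uncorrectable(status, mask=0):
    """Return a list of (bit, name, is_masked) for every bit set in status."""
    decoded = []
    for bit in range(32):
        if status & (1 << bit):
            name = UNCORRECTABLE_BITS.get(bit, "Reserved/unknown")
            decoded.append((bit, name, bool(mask & (1 << bit))))
    return decoded


if __name__ == "__main__":
    # Values taken from "error status/mask=00100000/00000000" in the log.
    for bit, name, masked in decode_uncorrectable(0x00100000, 0x00000000):
        print(f"[{bit}] {name} (masked: {masked})")
```

With the values from the log, only bit 20 is set and nothing is masked, which matches the "Unsupported Request" line the kernel prints.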

I'm happy I can boot now. If I can also stop the errors, that'd be great.