Hey everyone!
I cannot get any kernel past 4.5.0-23.el7.aarch64 to work on APM X-Gene hardware. This includes the new (CentOS 7.4) 4.11.0-22 kernel.
The last thing ever displayed during boot is this:
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
L3c Cache: 8MB
which is usual. However, nothing is logged to /var/log/messages (presumably it isn't making it far enough in the boot process) and I'm not sure how to further troubleshoot the issue.
Has anyone else been successful? Are there new patches I am unaware of? Can someone please help me debug deeper?
Thanks in advance!
j
Jeremiah,
Is this on Mustang board? Which version of bootloader are you using?
-Phong
+-----Original Message-----
+From: Arm-dev [mailto:arm-dev-bounces@centos.org] On Behalf Of Jeremiah Rothschild
+Sent: Thursday, September 21, 2017 1:45 PM
+To: arm-dev@centos.org
+Subject: [Arm-dev] Kernel problems on APM X-Gene
_______________________________________________
Arm-dev mailing list
Arm-dev@centos.org
https://lists.centos.org/mailman/listinfo/arm-dev
On Thu, Sep 21, 2017 at 04:19:55PM +0700, Phong Vo wrote:
> Jeremiah,
Hi Phong, thank you for the reply.
> Is this on Mustang board? Which version of bootloader are you using?
Pardon me for not being more specific. This is the Gigabyte MP30-AR0 board.
Using bootloader version:
U-Boot 2013.04-mp30ar0_sw_1.18.04 (Sep 02 2015 - 16:38:06) REV: F06b ( uart0 )
On 21.09.2017 at 11:32, Jeremiah Rothschild wrote:
> Pardon me for not being more specific. This is the Gigabyte MP30-AR0 board.
> Using bootloader version:
> U-Boot 2013.04-mp30ar0_sw_1.18.04 (Sep 02 2015 - 16:38:06) REV: F06b ( uart0 )
Have you considered moving to UEFI firmware?
I can confirm that 4.11 works on X-Gene machine as it booted fine on HPe Moonshot m400 cartridge:
[linaro@c8n1 ~]$ uname -a
Linux c8n1 4.11.0-22.el7.2.aarch64 #1 SMP Thu Sep 14 17:01:49 CDT 2017 aarch64 aarch64 aarch64 GNU/Linux
My APM Mustang is offline, so I cannot check there.
On Thu, Sep 21, 2017 at 11:52:40AM +0200, Marcin Juszkiewicz wrote:
> Have you considered moving to UEFI firmware?
I am a little hesitant to make changes since it is our only ARM development box. I could look into it, though. Thanks for the suggestion.
> I can confirm that 4.11 works on X-Gene machine as it booted fine on HPe Moonshot m400 cartridge:
Hmm, interesting. Perhaps if I can figure out how to get the kernel booting to actually display via video (or IPMI) then it would reveal an error message. It has never shown output after GRUB but I imagine there's a way to route it.
On Thu, Sep 21, 2017 at 03:14:41AM -0700, Jeremiah Rothschild wrote:
> Hmm, interesting. Perhaps if I can figure out how to get the kernel booting to actually display via video (or IPMI) then it would reveal an error message. It has never shown output after GRUB but I imagine there's a way to route it.
I was able to get output by adding this to my kernel parameters:
console=ttyS0,115200 earlycon=uart8250,mmio32,0x1c020000
From there, booting into 4.11, it hung here:
[ 4.168868] acpi ACPI0007:07: CPPC data invalid or not present
[ 4.175198] GHES: HEST is not enabled!
[ 4.179167] ACPI GTDT: [Firmware Bug]: failed to get the Watchdog base address.
[ 4.187021] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
From there, I added 'acpi=off' to the kernel parameters. Now I can successfully boot the 4.11 kernel.
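For anyone retracing these steps, it can be useful to confirm which of these parameters actually reached the kernel. The following is only a minimal Python sketch: the helper names `parse_cmdline` and `check_debug_params` are made up for illustration, and the example string simply combines the parameters quoted above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None.

    Simplification: if a key is repeated (e.g. two console= entries),
    only the last one is kept, and quoted values with spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def check_debug_params(params):
    """Report which of the settings mentioned in this thread are present."""
    return {
        "serial_console": (params.get("console") or "").startswith("ttyS"),
        "earlycon": "earlycon" in params,
        "acpi_off": params.get("acpi") == "off",
    }


# Example input built from the parameters quoted in this message.
example = "console=ttyS0,115200 earlycon=uart8250,mmio32,0x1c020000 acpi=off"
report = check_debug_params(parse_cmdline(example))
print(report)
```

On a running system the same functions could be fed the contents of /proc/cmdline instead of the example string.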
However, the console is flooded with messages like this:
[ 42.982471] pcieport 0002:00:00.0: TLP Header: 60001004 000000ff 00000090 30032000
[ 42.990201] pcieport 0002:00:00.0: AER: Device recovery failed
[ 43.156938] pcieport 0002:00:00.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0000
[ 43.165695] pcieport 0002:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 43.177385] pcieport 0002:00:00.0: device [10e8:e004] error status/mask=00100000/00000000
[ 43.185695] pcieport 0002:00:00.0: [20] Unsupported Request (First)
I'm happy I can boot now. If I can also stop the errors, that'd be great.
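When the console is flooded like this, a per-device tally is easier to share than the raw log. Below is a rough Python sketch that counts "PCIe Bus Error" lines by device, severity, and type. The regular expression is derived only from the lines pasted above, so treat it as an assumption about the message format rather than a complete AER parser; `summarize_aer` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern based solely on the "PCIe Bus Error" line shown in this thread.
BUS_ERROR = re.compile(
    r"pcieport (?P<dev>\S+): PCIe Bus Error: "
    r"severity=(?P<severity>[^,]+), type=(?P<type>[^,]+)"
)


def summarize_aer(log_text):
    """Count 'PCIe Bus Error' lines per (device, severity, type)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BUS_ERROR.search(line)
        if match:
            counts[(match["dev"], match["severity"], match["type"])] += 1
    return counts


# Two of the lines from the log above, used as sample input.
sample = """\
[ 42.990201] pcieport 0002:00:00.0: AER: Device recovery failed
[ 43.165695] pcieport 0002:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
"""

for key, count in summarize_aer(sample).items():
    print(count, *key)
```

The same function could be given saved `dmesg` output to see whether a single root port accounts for all of the errors.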
What is DT mode?
From my understanding of the thread so far, both Jeremiah and I are running Tianocore UEFI firmware, chain loaded from u-boot.
On Fri, Sep 22, 2017 at 10:06 AM, Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> wrote:
> On 22.09.2017 at 02:35, Jeremiah Rothschild wrote:
> > I'm happy I can boot now. If I can also stop the errors, that'd be great.
> Seriously, check UEFI firmware. I doubt that anyone tested the CentOS 4.11 kernel on an ARM server with U-Boot (or DT mode).
On Fri, Sep 22, 2017 at 08:14:03AM +0100, Gordan Bobic wrote:
> Do you have any PCIe cards in yours?
Actually, no.
> What is DT mode?
> From my understanding of the thread so far, both Jeremiah and I are running Tianocore UEFI firmware, chain loaded from u-boot.
That's right. I originally miscommented that I was not running UEFI then later followed up that I was.
It seems fairly clear to me that someone introduced an ACPI related bug in 4.5.0-25 and onwards. My system was always fine and always able to boot without adding 'acpi=off' until that version.
FWIW, I am running my own mainline LT 4.9.x kernel, and don't need acpi=off. As long as I don't add any PCIe cards, I don't get the errors like the ones you pasted. If you are interested, I'm more than happy to share my src.rpm for 4.9.x, but won't be able to get to it before tomorrow morning as the machine was recently mothballed.
Gordan
On Fri, Sep 22, 2017 at 11:39:19AM +0100, Gordan Bobic wrote:
> FWIW, I am running my own mainline LT 4.9.x kernel, and don't need acpi=off.
Interesting.
For the record, I don't mind running with acpi=off, but it's very weird to me that new kernels suddenly stopped working without it.
> If you are interested, I'm more than happy to share my src.rpm for 4.9.x, but won't be able to get to it before tomorrow morning as the machine was recently mothballed.
Thanks. I actually need to test with as new of a version as I can because I have been experiencing an occasional "page allocation failure" kernel panic. No idea if/when that was fixed but I figure the newest version is my best hope.
On Fri, Sep 22, 2017 at 11:54 AM, Jeremiah Rothschild <jeremiah@franz.com> wrote:
> On Fri, Sep 22, 2017 at 11:39:19AM +0100, Gordan Bobic wrote:
> > FWIW, I am running my own mainline LT 4.9.x kernel, and don't need acpi=off.
> Interesting.
> For the record, I don't mind running with acpi=off, but it's very weird to me that new kernels suddenly stopped working without it.
From the fact that I maintain my own kernel builds, you may infer how much faith I put in distro supplied kernels (any distro, not singling any specific distro out).
> > If you are interested, I'm more than happy to share my src.rpm for 4.9.x, but won't be able to get to it before tomorrow morning as the machine was recently mothballed.
> Thanks. I actually need to test with as new of a version as I can because I have been experiencing an occasional "page allocation failure" kernel panic. No idea if/when that was fixed but I figure the newest version is my best hope.
I've been on my own 4.9.x more or less since I got the machine, it was in 24/7 use, and I never experienced that issue. So it may be worth a cross-check with the kernel that I'm running to see whether the fault follows your machine or whether it is kernel dependent.
On Fri, Sep 22, 2017 at 11:59:00AM +0100, Gordan Bobic wrote:
> I've been on my own 4.9.x more or less since I got the machine, it was in 24/7 use, and I never experienced that issue. So it may be worth a cross-check with the kernel that I'm running to see whether the fault follows your machine or whether it is kernel dependent.
You're right. It would be a good extra data point. Feel free to mail me directly once you're sorted and I'll gladly check out your 4.9 build. Thanks again!
On 09/22/2017 03:34 AM, Jeremiah Rothschild wrote:
> That's right. I originally miscommented that I was not running UEFI then later followed up that I was.
Keep in mind that this doesn't mean the UEFI being used is getting the right EFI variables populated on its journey through uboot to a running system. Not all UEFI is equal.
> It seems fairly clear to me that someone introduced an ACPI related bug in 4.5.0-25 and onwards. My system was always fine and always able to boot without adding 'acpi=off' until that version.
The OS is built to expect ACPI. That reliance will increase as it progresses. As has been suggested previously, you should really consider moving to a supported boot platform rather than chainloading. There are a number of improvements in the new official AMI firmwares that the distro takes advantage of.
I don't see this as an ACPI bug, but rather a fix that closes bugs you were using as features until the more recent kernels.
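As a rough way to see what the firmware actually handed to the running kernel, one could look at what appears under /sys/firmware. The Python sketch below rests on two assumptions that should be checked against the kernel in use: that an EFI boot exposes an `efi` directory (with variables under `efi/efivars` when efivarfs is mounted), and that a kernel using ACPI exposes `acpi/tables`. It cannot tell whether the variables are correct, only whether they are present; `firmware_summary` is a hypothetical helper name.

```python
import os


def firmware_summary(sysfs_root="/sys/firmware"):
    """Summarize which firmware interfaces the running kernel exposes.

    Assumption: <root>/efi exists after an EFI boot and <root>/acpi/tables
    exists when ACPI is in use. Treat the result as a hint, not as proof.
    """
    efi_dir = os.path.join(sysfs_root, "efi")
    efivars_dir = os.path.join(efi_dir, "efivars")
    acpi_tables = os.path.join(sysfs_root, "acpi", "tables")
    try:
        efivar_count = len(os.listdir(efivars_dir))
    except OSError:
        efivar_count = 0  # not mounted, not readable, or not an EFI boot
    return {
        "efi_boot": os.path.isdir(efi_dir),
        "efivar_count": efivar_count,
        "acpi_tables": os.path.isdir(acpi_tables),
    }


print(firmware_summary())
```

Comparing the output between a kernel that boots normally and one booted with acpi=off may show whether the ACPI tables are visible at all.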
On Fri, Sep 22, 2017 at 09:02:59AM -0700, Jim Perrin wrote:
> Keep in mind that this doesn't mean the UEFI being used is getting the right EFI variables populated on its journey through uboot to a running system. Not all UEFI is equal.
> The OS is built to expect ACPI. That reliance will increase as it progresses. As has been suggested previously, you should really consider moving to a supported boot platform rather than chainloading. There are a number of improvements in the new official AMI firmwares that the distro takes advantage of.
Thanks for chiming in, Jim.
I was not aware that the chainloading boot method was unsupported or problematic.
Are there good docs out there to help me understand the process of replacing U-Boot with a supported method? I assume I won't need to reinstall my OS, just some sort of reflashing then pointing to GRUB?
On 09/22/2017 11:19 AM, Jeremiah Rothschild wrote:
> I was not aware that the chainloading boot method was unsupported or problematic.
> Are there good docs out there to help me understand the process of replacing U-Boot with a supported method? I assume I won't need to reinstall my OS, just some sort of reflashing then pointing to GRUB?
Phong Vo, who commented earlier (and is copied here), could likely point you to the best docs. You may not need to reinstall, but you would almost certainly need to use efibootmgr to create the proper entries for booting.
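To make the efibootmgr step a little more concrete, here is a small Python sketch that only builds and prints a candidate command; it does not run anything. Every value in it (disk, partition number, label, and loader path) is a placeholder assumption and must be checked against the actual EFI system partition before use. `efibootmgr_create_cmd` is a hypothetical helper name.

```python
import shlex


def efibootmgr_create_cmd(disk, part, label, loader):
    """Build (but do not run) an efibootmgr command that adds a boot entry."""
    return [
        "efibootmgr",
        "--create",
        "--disk", disk,
        "--part", str(part),
        "--label", label,
        "--loader", loader,
    ]


# All four values are placeholders; verify them on your own system first.
cmd = efibootmgr_create_cmd(
    disk="/dev/sda",
    part=1,
    label="CentOS Linux",
    loader=r"\EFI\centos\grubaa64.efi",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the command first leaves room to review it, and to compare it with the existing entries listed by running efibootmgr with no arguments, before changing any firmware boot variables.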
Jeremiah,
You will need to use UEFI firmware, either Gigabyte AMI BIOS or APM UEFI Tianocore. The official version is the AMI BIOS; you should contact AMI for it.
-Phong
On Thu, Sep 21, 2017 at 05:09:05PM +0700, Phong Vo wrote:
> Jeremiah,
> You will need to use UEFI firmware, either Gigabyte AMI BIOS or APM UEFI Tianocore. The official version is the AMI BIOS; you should contact AMI for it.
I apologize; I forgot that I am already chain-loading TianoCore UEFI from U-Boot.
Here is my version:
TianoCore 1.20.03-uhp UEFI 2.4.0 Feb 22 2016 11:17:26
+On Thu, Sep 21, 2017 at 04:19:55PM +0700, Phong Vo wrote: +> Jeremiah,
+Hi Phong, thank you for the reply.
+> Is this on Mustang board? Which version of bootloader are you using?
+Pardon me for not being more specific. This is the Gigabyte MP30-AR0 +board.
+Using bootloader version: +U-Boot 2013.04-mp30ar0_sw_1.18.04 (Sep 02 2015 - 16:38:06) REV: F06b ( +uart0 )
+> -Phong +> +> +-----Original Message----- +> +From: Arm-dev [mailto:arm-dev-bounces@centos.org] On Behalf Of +> +Jeremiah Rothschild +> +Sent: Thursday, September 21, 2017 1:45 PM +> +To: arm-dev@centos.org +> +Subject: [Arm-dev] Kernel problems on APM X-Gene +> + +> +Hey everyone! +> + +> +I cannot get any kernel past 4.5.0-23.el7.aarch64 to work on APM +> +X-Gene hardware. This includes the new (CentOS 7.4) 4.11.0-22 kernel. +> + +> +The last thing ever displayed during boot is this: +> + +> +EFI stub: Booting Linux Kernel... +> +EFI stub: Using DTB from configuration table EFI stub: Exiting boot +> +services and installing virtual address map... +> +L3c Cache: 8MB +> + +> +which is usual. However, nothing is logged to /var/log/messages +> +(presumably it isn't making it far enough in the boot process) and +> +I'm not sure how to further troubleshoot the issue. +> + +> +Has anyone else been successful? Are there new patches I am unaware +of? +> +Can someone please help me debug deeper? +> + +> +Thanks in advance! +> + +> +j +> +_______________________________________________ +> +Arm-dev mailing list +> +Arm-dev@centos.org +> +https://lists.centos.org/mailman/listinfo/arm-dev +> _______________________________________________ +> Arm-dev mailing list +> Arm-dev@centos.org +> https://lists.centos.org/mailman/listinfo/arm-dev +_______________________________________________ +Arm-dev mailing list +Arm-dev@centos.org +https://lists.centos.org/mailman/listinfo/arm-dev _______________________________________________ Arm-dev mailing list Arm-dev@centos.org https://lists.centos.org/mailman/listinfo/arm-dev