[CentOS-virt] Clocksource boot issues 4.9.13

Fri Apr 7 10:00:48 UTC 2017
Chris Elliott <chris at chriselliott.info>

I’ve tested the 4.9.20-26 kernel on various CentOS 6 systems and it seems to work fine, including with Xen 4.6.3-12, which was just released.

I still can’t get one particular server to boot on 4.9 without hanging, but strangely another identical server with the same BIOS version works fine, so I’m thinking it’s bad hardware. It’s odd that the previous 3.4, 3.10 and 3.14 kernels had no issues.

Chris

On 03/04/2017, 16:21, "CentOS-virt on behalf of Johnny Hughes" <centos-virt-bounces at centos.org on behalf of johnny at centos.org> wrote:

    On 04/03/2017 04:20 AM, George Dunlap wrote:
    > On Sun, Apr 2, 2017 at 9:22 PM, Sarah Newman <srn at prgmr.com> wrote:
    >> On 04/02/2017 02:49 AM, Chris Elliott wrote:
    >>> Hi all
    >>>
    >>> I’ve got a few Intel Z87 chipset machines with Adaptec 5405 RAID cards (latest firmware). They work fine on 3.18, but during Dom0 boot with kernel
    >>> 4.9.13 they hang at “Using clocksource tsc” and the aacraid driver keeps trying to reset.
    >>>
    >>> Has anyone seen anything like this?
    >>>
    >>> I’ve tried specifying clocksource=xen in grub instead of the default of tsc, and that has the same issue. HPET is enabled and Xen is seeing it:
    >>>
    >>> (XEN) ACPI: HPET D9649CB0, 0038 (r1 ALASKA    A M I  1072009 AMI.        5)
    >>> (XEN) Platform timer is 14.318MHz HPET
    >>
    >> I saw a hang at a similar place in the boot process when trying to boot xen-on-xen for our test system. On a hunch I was going to try recompiling
    >> without the PVHVM PCI related driver (pci-platform ? platform-pci ? ) before saying anything about it.
    >>
    >> Since you tried changing the clock source, I'm wondering if the boot issue is unrelated to the clock source, in which case you may get a better
    >> idea of what's hanging by comparing the boot logs from 3.18 to 4.9 and seeing what's present in 3.18 but not in 4.9. Presumably the messages that
    >> appear in the 3.18 log but not the 4.9 log were either removed from the kernel source or happen after whatever is hanging.
    > 
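Sarah's log comparison could be sketched roughly as follows, assuming both boots were captured to text (for example from a serial console). The function names and the masking rules are illustrative assumptions, not part of any existing tool: timestamps are stripped and numbers are masked so that lines differing only in addresses or timings still count as the same message.

```python
import re
import sys

# Leading "[    1.234567] " timestamp as printed by the kernel log.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Values that commonly differ between boots: 0x-prefixed hex and bare integers.
VOLATILE = re.compile(r"0x[0-9a-fA-F]+|\b\d+\b")


def normalize(line):
    """Strip the timestamp and mask volatile numbers for comparison."""
    line = TIMESTAMP.sub("", line.strip())
    return VOLATILE.sub("N", line)


def missing_messages(good_log, bad_log):
    """Return messages from good_log that never appear in bad_log.

    Order follows good_log, so the first entries are the earliest
    messages that the hanging boot did not reach (or no longer prints).
    """
    seen = {normalize(l) for l in bad_log.splitlines() if l.strip()}
    missing = []
    for l in good_log.splitlines():
        if l.strip() and normalize(l) not in seen:
            missing.append(TIMESTAMP.sub("", l.strip()))
    return missing


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: script.py boot-3.18.log boot-4.9.log
    with open(sys.argv[1]) as good, open(sys.argv[2]) as bad:
        for message in missing_messages(good.read(), bad.read()):
            print(message)
```

Messages that were simply reworded or removed between kernel versions will also show up in the output, so the list is a starting point for reading the logs rather than a diagnosis.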
    > The Xen-on-xen thing is a specific problem with nested Xen; I asked on
    > xen-devel and was pointed to this commit.
    > 
    > Unfortunately it's pretty unlikely this one will help Chris.
    > 
    > But perhaps, Chris, if you follow my example and post a bug report to
    > xen-devel (with serial output from Xen and the guest kernel), someone
    > may be able to find a patch which fixes the problem.
    > 
    
    I have a test kernel that fixes the xen on xen issue:
    
    https://people.centos.org/hughesjr/4.9.16/
    
    There are CentOS 6 and 7 4.9.20 kernels in there; give them a try.  (I know,
    the directory says 4.9.16 and the newest kernels are 4.9.20, but it is
    going away once we build them for real :D)
    
    Also, for Chris: try these parameters on the vmlinuz line:
    
    clocksource=tsc tsc=reliable
    
    Thanks,
    Johnny Hughes
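
On a machine that does boot (for instance the identical server that works, or the problem server under 3.18), one way to confirm that parameters such as the ones Johnny suggests actually took effect is to compare the kernel command line with the clocksource the kernel reports. The sketch below reads /proc/cmdline and the standard sysfs clocksource files; the helper names and the report layout are assumptions made for illustration.

```python
from pathlib import Path

# Standard Linux sysfs location for the system clocksource.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")
CMDLINE = Path("/proc/cmdline")


def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def clocksource_report(cmdline, current, available):
    """Compare the requested clocksource with what the kernel is using."""
    params = parse_cmdline(cmdline)
    requested = params.get("clocksource")
    current = current.strip()
    return {
        "requested": requested,            # clocksource= value, if any
        "tsc_option": params.get("tsc"),   # e.g. "reliable"
        "current": current,
        "available": available.split(),
        # True when nothing was requested or the request was honored.
        "honored": requested in (None, current),
    }


if __name__ == "__main__" and CLOCKSOURCE_DIR.is_dir() and CMDLINE.is_file():
    report = clocksource_report(
        CMDLINE.read_text(),
        (CLOCKSOURCE_DIR / "current_clocksource").read_text(),
        (CLOCKSOURCE_DIR / "available_clocksource").read_text(),
    )
    for key, value in report.items():
        print(f"{key}: {value}")
```

Under Xen the Dom0 kernel may still select a different clocksource than the one requested, which is what the "honored" field is meant to surface.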