[CentOS-virt] Clocksource boot issues 4.9.13

Mon Apr 3 15:21:51 UTC 2017
Johnny Hughes <johnny at centos.org>

On 04/03/2017 04:20 AM, George Dunlap wrote:
> On Sun, Apr 2, 2017 at 9:22 PM, Sarah Newman <srn at prgmr.com> wrote:
>> On 04/02/2017 02:49 AM, Chris Elliott wrote:
>>> Hi all
>>>
>>> I’ve got a few Intel Z87 chipset machines with Adaptec 5405 RAID cards (latest firmware). They work fine on 3.18, but during Dom0 boot using kernel
>>> 4.9.13 it hangs at “Using clocksource tsc” and the aacraid driver keeps trying to reset.
>>>
>>> Has anyone seen anything like this?
>>>
>>> I’ve tried specifying clocksource=xen in grub instead of the default of tsc, and that has the same issue. HPET is enabled and Xen is seeing it:
>>>
>>> (XEN) ACPI: HPET D9649CB0, 0038 (r1 ALASKA    A M I  1072009 AMI.        5)
>>> (XEN) Platform timer is 14.318MHz HPET
>>
>> I saw a hang at a similar place in the boot process when trying to boot xen-on-xen for our test system. On a hunch I was going to try recompiling
>> without the PVHVM PCI related driver (pci-platform ? platform-pci ? ) before saying anything about it.
>>
>> Since you tried changing the clock source, I'm wondering if the boot issue is unrelated to the clock source, in which case you may get a better
>> idea of what's hanging by comparing the boot logs from 3.18 to 4.9 and seeing what's present in 3.18 but not in 4.9. Presumably the messages in the
>> 3.18 but not 4.9 logs are either removed from the kernel source or happen after whatever is hanging.
> 
> The Xen-on-xen thing is a specific problem with nested Xen; I asked on
> xen-devel and was pointed to this commit.
> 
> Unfortunately it's pretty unlikely this one will help Chris.
> 
> But perhaps, Chris, if you follow my example and post a bug report to
> xen-devel (with serial output from Xen and the guest kernel), someone
> may be able to find a patch which fixes the problem.
> 
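
As a rough sketch of the log comparison Sarah suggested above (what is
present in the 3.18 boot log but not in the 4.9 one), something like the
following could help. The file names are hypothetical and the timestamp
pattern assumes the usual "[    1.234567] " dmesg prefix; adjust it if
your serial capture looks different.

```python
import re
import sys

# Matches a leading dmesg-style timestamp such as "[    1.234567] ".
# This prefix format is an assumption about how the logs were captured.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(line):
    """Drop the timestamp and surrounding whitespace so lines compare across boots."""
    return TIMESTAMP.sub("", line.strip())


def only_in_old(old_lines, new_lines):
    """Return normalized messages found in the old log but not in the new one,
    in the order they appear in the old log."""
    seen_new = {normalize(line) for line in new_lines}
    missing = []
    for line in old_lines:
        msg = normalize(line)
        if msg and msg not in seen_new:
            missing.append(msg)
    return missing


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python3 logdiff.py dmesg-3.18.txt dmesg-4.9.txt
    with open(sys.argv[1]) as old_f, open(sys.argv[2]) as new_f:
        for msg in only_in_old(old_f.readlines(), new_f.readlines()):
            print(msg)
```

Messages that contain addresses or version strings will show up as
different even when they are harmless, so expect some noise in the output.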

I have a test kernel that fixes the xen on xen issue:

https://people.centos.org/hughesjr/4.9.16/

There are CentOS 6 and 7 4.9.20 kernels in there .. give them a try.  (I know,
the directory says 4.9.16 and the newest kernels are 4.9.20 .. but it is
going away once we build them for real :D)

Also for Chris:  try these parameters on the vmlinuz (dom0 kernel) line:

clocksource=tsc tsc=reliable
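
If the box does get far enough to give you a shell, a small check like
this one can confirm that the parameters were picked up and show which
clock source the kernel ended up with. It is only a sketch: it reads the
standard sysfs clocksource files and /proc/cmdline, and the function
names are mine.

```python
import os

# Standard sysfs location for the kernel's clock source state.
SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=SYSFS_DIR):
    """Return (current, available) clock sources, or (None, []) if unreadable."""
    try:
        with open(os.path.join(base, "current_clocksource")) as f:
            current = f.read().strip()
        with open(os.path.join(base, "available_clocksource")) as f:
            available = f.read().split()
    except OSError:
        return None, []
    return current, available


def clock_params(cmdline_path="/proc/cmdline"):
    """Return the clocksource= and tsc= parameters the kernel was booted with."""
    try:
        with open(cmdline_path) as f:
            params = f.read().split()
    except OSError:
        return []
    return [p for p in params if p.startswith(("clocksource=", "tsc="))]


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current clocksource:  ", current)
    print("available clocksources:", " ".join(available))
    print("boot parameters:       ", " ".join(clock_params()))
```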

Thanks,
Johnny Hughes

