[CentOS-virt] Fwd: Xen4CentOS kernel panic on dom0 reboot

Fri Mar 14 14:46:40 UTC 2014
Karl Johnson <karljohnson.it at gmail.com>

I don't think it's related to microcode_ctl: my server rebooted just fine
until I switched to Xen4CentOS in December 2013, and the reboot issue started
then. Note that it happens not only on Dell servers but also on Supermicro, so
I don't think it's hardware related either.

Karl


On Thu, Mar 13, 2014 at 8:14 PM, PJ Welsh <pjwelsh at gmail.com> wrote:

> Anyone think it's related to microcode_ctl as noted in
> http://lists.us.dell.com/pipermail/linux-poweredge/2013-October/048538.html?
> I will see about testing for this issue ASAP...
>
> pjwelsh
>
>
> On Thu, Mar 13, 2014 at 6:58 PM, PJ Welsh <pjwelsh at gmail.com> wrote:
>
>> Comments at bottom:
>>
>>
>> On Tue, Mar 11, 2014 at 12:32 PM, Phillippe Welsh <pjwelsh at gmail.com> wrote:
>>
>>> Comments inline:
>>>
>>> ----- Original Message -----
>>> > From: "Pasi Kärkkäinen" <pasik at iki.fi>
>>> > To: "Discussion about the virtualization on CentOS" <centos-virt at centos.org>
>>> > Sent: Sunday, March 9, 2014 4:32:13 AM
>>> > Subject: Re: [CentOS-virt] Fwd: Xen4CentOS kernel panic on dom0 reboot
>>> >
>>> > On Sat, Mar 08, 2014 at 09:09:07AM -0600, PJ Welsh wrote:
>>> > >    No, I have not followed those instructions yet. These were
>>> > >    production
>>> > >    servers that I had scheduled firmware updates late Sunday
>>> > >    evening. The
>>> > >    first time I thought the error was a fluke and only began to
>>> > >    research it
>>> > >    after the second failure (and still no firmware updates due to
>>> > >    the
>>> > >    power-cycle). I may try to sneak in a restart of one of the
>>> > >    systems late
>>> > >    Sunday night US CT.
>>> > >
>>> >
>>> > OK.
>>>
>>> I ran the "stop" for all of the xen related pieces in the order that the
>>> /etc/rc3.d/ had them.
>>> The VMs did not shut down and the /usr/lib64/xen/bin/qemu-dm STUFF
>>> entries were left behind running.
>>> Since I could no longer xm shutdown, I killed off all qemu-dm
>>> processes and attempted a reboot...
>>> HUNG on the reboot with the prepended umount error messages...
>>>
>>> >
>>> > >    Still not sure why the running vm's would stop the reboot... The
>>> > >    server
>>> > >    shows that it was supposed to be restarting. I have had a similar
>>> > >    stuck on
>>> > >    restarting message (minus all the umount errors) on some Dell
>>> > >    T105's
>>> > >    running CentOS 6.5 and the "reboot=pci" grub.conf kernel option
>>> > >    is what
>>> > >    ended up working for them. I have not tested that possible
>>> > >    option yet,
>>> > >    either since that would take 2 reboots to put into place.
>>> > >
>>> >
>>> > Yeah, it's worth testing both, to figure out what's wrong.
>>>
>>> Next reboot attempt included the "reboot=pci" grub.conf kernel option...
>>> No effect :(
>>> HUNG on the reboot with the prepended umount error messages...
>>>
>>> I ran out of time to attempt an xm shutdown for each VM manually, then
>>> reboot.
>>>
>>> What's interesting is that when I do an lsof on the file system that is
>>> unable to umount, the *only* connected PIDs are the qemu-dm ones, but not
>>> *all* of them???
>>>
>>> Thanks
>>>
>>> PJ
>>> ...
>>>
>>
>> UPDATE: I cleanly shut down *all* vm's and unmounted the filesystem that
>> had the umount issue noted previously and then issued the reboot command.
>> The Dell R710 *STILL* hangs at the rebooting line.
>>
>> No reboot possible on 2 Dell R710's with at least the 2 most recent
>> Xen4CentOS kernels.
>>
>> Any other suggestions?
>>
>> Thanks
>>
>> pjwelsh
>>
>
>
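
The lsof check mentioned in the quoted thread (finding which PIDs still hold
the filesystem that refuses to umount) could be sketched in Python roughly as
follows. This is only an illustration, not what was run on the R710s: the
mount point /srv/vm and the helper names parse_lsof_fields and holders are
hypothetical, and it assumes lsof's field output mode (-F pc), in which each
line starts with p (PID) or c (command name).

```python
import subprocess
from collections import defaultdict


def parse_lsof_fields(output):
    """Group PIDs by command name from `lsof -F pc` style output."""
    by_command = defaultdict(set)
    pid = None
    for line in output.splitlines():
        if line.startswith("p"):
            # Start of a new process set: remember its PID.
            pid = int(line[1:])
        elif line.startswith("c") and pid is not None:
            # Command name belonging to the most recent PID.
            by_command[line[1:]].add(pid)
        # Other field lines (e.g. "f" file descriptors) are ignored.
    return {cmd: sorted(pids) for cmd, pids in by_command.items()}


def holders(mount_point):
    """Return {command: [pids]} for processes with files open on mount_point.

    mount_point is a hypothetical path; substitute the filesystem that
    fails to umount.
    """
    # lsof exits non-zero when it finds nothing, so check=True is not used.
    result = subprocess.run(
        ["lsof", "-F", "pc", mount_point],
        capture_output=True,
        text=True,
    )
    return parse_lsof_fields(result.stdout)


if __name__ == "__main__":
    for command, pids in holders("/srv/vm").items():  # hypothetical path
        print(command, pids)
```

Comparing holders("/srv/vm").get("qemu-dm", []) with the full list of running
qemu-dm PIDs would show which of them keep the filesystem busy and which do
not.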