I don't think it's related to microcode_ctl: my server was rebooting just fine until I switched to Xen4CentOS in December 2013, and that is when the reboot issue started. Note that it happens not only on Dell servers but also on Supermicro, so I don't think it's hardware related either.
Karl
On Thu, Mar 13, 2014 at 8:14 PM, PJ Welsh pjwelsh@gmail.com wrote:
Anyone think it's related to microcode_ctl as noted in http://lists.us.dell.com/pipermail/linux-poweredge/2013-October/048538.html? I will see about testing for this issue ASAP...
pjwelsh
On Thu, Mar 13, 2014 at 6:58 PM, PJ Welsh pjwelsh@gmail.com wrote:
Comments at bottom:
On Tue, Mar 11, 2014 at 12:32 PM, Phillippe Welsh pjwelsh@gmail.com wrote:
Comments inline:
----- Original Message -----
From: "Pasi Kärkkäinen" pasik@iki.fi
To: "Discussion about the virtualization on CentOS" <centos-virt@centos.org>
Sent: Sunday, March 9, 2014 4:32:13 AM
Subject: Re: [CentOS-virt] Fwd: Xen4CentOS kernel panic on dom0 reboot
On Sat, Mar 08, 2014 at 09:09:07AM -0600, PJ Welsh wrote:
No, I have not followed those instructions yet. These were production servers for which I had scheduled firmware updates late Sunday evening. The first time I thought the error was a fluke and only began to research it after the second failure (and still no firmware updates due to the power-cycle). I may try to sneak in a restart of one of the systems late Sunday night US CT.
OK.
I ran the "stop" for all of the xen related pieces in the order that the /etc/rc3.d/ had them. The VMs did not shut down and the /usr/lib64/xen/bin/qemu-dm STUFF entries were left behind running. Since I could not xm shutdown any longer, I killed off all qemu-dm processes and attempted a reboot... HUNG on the reboot with the prepended umount error messages...
Still not sure why the running VMs would stop the reboot... The server shows that it was supposed to be restarting. I have had a similar stuck-on-restarting message (minus all the umount errors) on some Dell T105's running CentOS 6.5, and the "reboot=pci" grub.conf kernel option is what ended up working for them. I have not tested that possible option yet either, since that would take 2 reboots to put into place.
Yeah, it's worth testing both, to figure out what's wrong.
Next reboot attempt included the "reboot=pci" grub.conf kernel option... No effect :( HUNG on the reboot with the prepended umount error messages...
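For anyone repeating this test, it may be worth confirming that the running dom0 kernel actually received the option, since an edit to the wrong grub.conf entry would also look like "no effect". A minimal sketch in Python, assuming the standard Linux /proc/cmdline file; the function names are made up for illustration and are not part of any Xen or CentOS tool:

```python
def reboot_option(cmdline):
    """Return the value of the last reboot= parameter in a kernel
    command line string, or None if the parameter is absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("reboot="):
            value = token[len("reboot="):]
    return value


def running_reboot_option(path="/proc/cmdline"):
    """Read the command line of the currently running kernel
    (assumes a Linux /proc filesystem) and extract reboot=."""
    with open(path) as f:
        return reboot_option(f.read())
```

Calling running_reboot_option() in dom0 after the reboot should return "pci" if the option was picked up, and None otherwise.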
I ran out of time to attempt an xm shutdown for each VM manually, then reboot.
What's interesting is that when I do an lsof on the file system that is unable to umount, the *only* connected PIDs are the qemu-dm ones, but not *all* of them?
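The lsof check can be approximated in a script by walking /proc and listing which PIDs hold open file descriptors under a mount point. This is only a rough sketch in Python, assuming a Linux /proc layout; the function name is hypothetical, and unlike lsof it looks only at open file descriptors, not at working directories or memory-mapped files:

```python
import os


def pids_holding(mount_point, proc_root="/proc"):
    """Return {pid: [open paths]} for processes that have file
    descriptors open on or below mount_point."""
    mount = os.path.normpath(mount_point)
    prefix = os.path.join(mount, "")  # mount path with a trailing slash
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or not permitted to inspect it
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while scanning
            if target == mount or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

Run as root in dom0 with the path of the file system that refuses to umount; comparing the returned PIDs against the qemu-dm process list would show which of them are still holding files there.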
Thanks
PJ ...
UPDATE: I cleanly shut down *all* VMs and unmounted the filesystem that had the umount issue noted previously and then issued the reboot command. The Dell R710 *STILL* hangs at the rebooting line.
No reboot possible on 2 Dell R710's with at least the 2 most recent Xen4CentOS kernels.
Any other suggestions?
Thanks
pjwelsh