Subject: Re: [CentOS-virt] Xen4CentOS kernel panic on dom0 reboot
On Wed, Mar 5, 2014 at 10:17 AM, David Vrabel <david.vrabel at citrix.com> wrote:
On 05/03/14 15:09, Karl Johnson wrote:
I've been using Xen4CentOS for the last 3 months. It's working fine and dom0/domUs are stable, but the server kernel-panics on reboot and has to be hard reset manually. It has panicked on the last 3 reboots.
There is a xenbus device still present and during shutdown it is trying to set it to CLOSED but at this point xenstored isn't running and the xenbus write stalls.
Do you have VMs that are still running when you attempt a reboot? If so shutting them down will likely avoid this.
Can you provide the output of xenstore-ls prior to attempting a reboot?
I thought the Xen init.d scripts would stop all of them before rebooting? Here's the output of chkconfig and xenstore-ls:
http://pastebin.centos.org/8186/
Thanks,
Karl
I've got the "me-too" on the reboot hang issue for 2 different Dell R710's with xen-4.2.4-29.el6 and at least the kernels kernel-3.10.25-11.el6.centos.alt.x86_64 and kernel-3.10.23-11.el6.centos.alt.x86_64. I have not tried to reboot with the latest kernel-3.10.32-11.el6.centos.alt.x86_64 (if the kernel even makes a difference). I *have* had the dom0_mem=1024M,max:1024M option in place for all of them, with only 6 VMs.
Any new suggestions?
pjwelsh
On Thu, Mar 06, 2014 at 01:54:22PM -0600, Phillippe Welsh wrote:
I've got the "me-too" on the reboot hang issue for 2 different Dell R710's with xen-4.2.4-29.el6 and at least the kernels kernel-3.10.25-11.el6.centos.alt.x86_64 and kernel-3.10.23-11.el6.centos.alt.x86_64. I have not tried to reboot with the latest kernel-3.10.32-11.el6.centos.alt.x86_64 (if the kernel even makes a difference). I *have* had the dom0_mem=1024M,max:1024M option in place for all of them, with only 6 VMs.
Any new suggestions?
So did you make sure all the VMs are shut down before trying to reboot dom0?
-- Pasi
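For anyone wanting to try the "shut everything down first" test, here is a minimal sketch of one way to do it. It assumes Python 3 is available on the dom0, that the `xm` toolstack is in use (as elsewhere in this thread), and that `xm list` prints a one-line header followed by one row per domain with the name in the first column. The function names and the output file name are made up for illustration.

```python
import subprocess


def running_domus(xm_list_output):
    """Return the guest names found in `xm list` output, skipping dom0."""
    names = []
    for line in xm_list_output.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if fields and fields[0] != "Domain-0":
            names.append(fields[0])
    return names


def shutdown_commands(names):
    """Build one `xm shutdown -w <name>` command per guest (-w waits for completion)."""
    return [["xm", "shutdown", "-w", name] for name in names]


def shut_down_all_and_dump(dump_path="xenstore-before-reboot.txt"):
    """Shut down every guest, then save `xenstore-ls` output for later inspection.

    Hypothetical helper: run as root on the dom0 while xend and xenstored
    are still running, i.e. before any of the init scripts are stopped.
    """
    listing = subprocess.check_output(["xm", "list"], universal_newlines=True)
    for cmd in shutdown_commands(running_domus(listing)):
        subprocess.check_call(cmd)
    with open(dump_path, "w") as out:
        subprocess.check_call(["xenstore-ls"], stdout=out)
```

Calling `shut_down_all_and_dump()` right before `reboot` would give both a clean guest shutdown and the xenstore-ls output David asked for. Nothing here has been tested against the R710s discussed in this thread.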
No, I have not followed those instructions yet. These were production servers on which I had scheduled firmware updates for late Sunday evening. The first time, I thought the error was a fluke, and I only began to research it after the second failure (and still no firmware updates, due to the power-cycle). I may try to sneak in a restart of one of the systems late Sunday night US CT.
Still not sure why the running VMs would stop the reboot... The server shows that it was supposed to be restarting. I have had a similar stuck-on-restarting message (minus all the umount errors) on some Dell T105's running CentOS 6.5, and the "reboot=pci" grub.conf kernel option is what ended up working for them. I have not tested that option here yet either, since it would take 2 reboots to put into place.
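As a rough illustration of the grub.conf change, the sketch below appends an option such as `reboot=pci` to the dom0 kernel line. It assumes the usual Xen-style legacy GRUB stanza, where `kernel` loads xen.gz and the `module` line containing "vmlinuz" loads the dom0 Linux kernel. The thread does not say which line the option was added to, so placing it on the vmlinuz line is an assumption, and the function name is hypothetical.

```python
def add_dom0_kernel_option(grub_conf_text, option):
    """Append `option` to every dom0 Linux kernel line that does not have it yet.

    Assumption: Xen-style legacy GRUB layout, where the dom0 kernel is the
    `module` line whose path contains "vmlinuz". Other lines are left alone.
    """
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        is_dom0_kernel = (
            len(words) >= 2 and words[0] == "module" and "vmlinuz" in words[1]
        )
        if is_dom0_kernel and option not in words[2:]:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"
```

The function only transforms text, so it can be tried on a copy of /boot/grub/grub.conf first; running it twice leaves the file unchanged the second time. The new option only takes effect from the next boot onward.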
CentOS-virt mailing list CentOS-virt@centos.org http://lists.centos.org/mailman/listinfo/centos-virt
On Sat, Mar 08, 2014 at 09:09:07AM -0600, PJ Welsh wrote:
No, I have not followed those instructions yet. These were production servers on which I had scheduled firmware updates for late Sunday evening. The first time, I thought the error was a fluke, and I only began to research it after the second failure (and still no firmware updates, due to the power-cycle). I may try to sneak in a restart of one of the systems late Sunday night US CT.
OK.
Still not sure why the running VMs would stop the reboot... The server shows that it was supposed to be restarting. I have had a similar stuck-on-restarting message (minus all the umount errors) on some Dell T105's running CentOS 6.5, and the "reboot=pci" grub.conf kernel option is what ended up working for them. I have not tested that option here yet either, since it would take 2 reboots to put into place.
Yeah, it's worth testing both, to figure out what's wrong.
-- Pasi
Comments inline:
----- Original Message -----
From: "Pasi Kärkkäinen" pasik@iki.fi
To: "Discussion about the virtualization on CentOS" centos-virt@centos.org
Sent: Sunday, March 9, 2014 4:32:13 AM
Subject: Re: [CentOS-virt] Fwd: Xen4CentOS kernel panic on dom0 reboot
On Sat, Mar 08, 2014 at 09:09:07AM -0600, PJ Welsh wrote:
No, I have not followed those instructions yet. These were production servers on which I had scheduled firmware updates for late Sunday evening. The first time, I thought the error was a fluke, and I only began to research it after the second failure (and still no firmware updates, due to the power-cycle). I may try to sneak in a restart of one of the systems late Sunday night US CT.
OK.
I ran the "stop" for all of the Xen-related pieces in the order that /etc/rc3.d/ had them. The VMs did not shut down, and the /usr/lib64/xen/bin/qemu-dm STUFF entries were left behind running. Since I could no longer xm shutdown, I killed off all the qemu-dm processes and attempted a reboot... HUNG on the reboot with the prepended umount error messages...
Still not sure why the running VMs would stop the reboot... The server shows that it was supposed to be restarting. I have had a similar stuck-on-restarting message (minus all the umount errors) on some Dell T105's running CentOS 6.5, and the "reboot=pci" grub.conf kernel option is what ended up working for them. I have not tested that option here yet either, since it would take 2 reboots to put into place.
Yeah, it's worth testing both, to figure out what's wrong.
The next reboot attempt included the "reboot=pci" grub.conf kernel option... No effect :( HUNG on the reboot with the prepended umount error messages...
I ran out of time to attempt an xm shutdown for each VM manually, followed by a reboot.
What's interesting is that when I run lsof on the file system that cannot be unmounted, the *only* connected PIDs are the qemu-dm ones, but not *all* of them???
Thanks
PJ ...
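To make the lsof observation easier to repeat, here is a small sketch that groups the processes holding files open on a mount point by command name, and then reports which PIDs of a given command are not among the holders. It assumes Python 3 and lsof's field output mode (`lsof -F pc <mountpoint>`), where each process starts with a `p<PID>` line followed by a `c<command>` line. The function names are hypothetical.

```python
def holders_by_command(lsof_field_output):
    """Parse `lsof -F pc <mountpoint>` output into {command: sorted list of PIDs}."""
    holders = {}
    pid = None
    for line in lsof_field_output.splitlines():
        if line.startswith("p"):
            pid = int(line[1:])  # start of a new process record
        elif line.startswith("c") and pid is not None:
            holders.setdefault(line[1:], set()).add(pid)
        # other field lines (for example f<fd>) are ignored
    return {cmd: sorted(pids) for cmd, pids in holders.items()}


def idle_pids(all_pids, holder_pids):
    """Return the PIDs that do NOT hold anything open on the file system."""
    return sorted(set(all_pids) - set(holder_pids))
```

Feeding `idle_pids` the full qemu-dm PID list (for example from `pgrep -x qemu-dm`) together with the qemu-dm entry from `holders_by_command` would show exactly which qemu-dm processes keep the file system busy and which do not.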
UPDATE: I cleanly shut down *all* VMs, unmounted the filesystem that had the umount issue noted previously, and then issued the reboot command. The Dell R710 *STILL* hangs at the rebooting line.
No reboot is possible on 2 Dell R710's with at least the 2 most recent Xen4CentOS kernels.
Any other suggestions?
Thanks
pjwelsh
Anyone think it's related to microcode_ctl as noted in http://lists.us.dell.com/pipermail/linux-poweredge/2013-October/048538.html? I will see about testing for this issue ASAP...
pjwelsh