[CentOS-virt] KVM instance keeps crashing

Fri Oct 15 09:57:17 UTC 2010
Poh Yong Hwang <yongsan at gmail.com>


The message log is from the guest, which becomes unresponsive from time to
time. I ran the following, and it reports the same value on both the host
and the guest:

[root@localhost conf]# cat /proc/sys/kernel/tainted
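
For anyone decoding the number this command prints: it is a bitmask of taint flags ORed together. Below is a minimal sketch in Python 3 of how it could be decoded. The bit meanings are the low bits as I understand them from the kernel's sysctl documentation. The names `TAINT_FLAGS`, `decode_taint` and `read_taint` are made up for this example, and a vendor kernel (RHEL/CentOS included) may define additional bits, which the sketch reports as unrecognized rather than guessing.

```python
# Low taint bits as described in the kernel's Documentation/sysctl/kernel.txt.
# Assumption: vendor kernels may define further bits not listed here.
TAINT_FLAGS = {
    1: "proprietary module was loaded",
    2: "module was force loaded",
    4: "unsafe SMP configuration",
    8: "module was force unloaded",
    16: "machine check exception (hardware error) occurred",
    32: "bad page reference",
    64: "taint requested by userspace",
    128: "kernel has died recently (OOPS or BUG)",
}


def decode_taint(value):
    """Return a description for every bit set in the taint bitmask."""
    reasons = [desc for bit, desc in sorted(TAINT_FLAGS.items()) if value & bit]
    # Anything outside the table above is reported, not interpreted.
    unknown = value & ~sum(TAINT_FLAGS)
    if unknown:
        reasons.append(f"unrecognized bits: {unknown:#x}")
    return reasons


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Calling `decode_taint(read_taint())` on the host and on the guest gives a readable list for each. As an illustration, `decode_taint(1)` returns only the proprietary-module entry, which is the flag that the "P" in "Tainted: P" denotes.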


On Fri, Oct 15, 2010 at 1:27 AM, Eric Searcy <emsearcy at gmail.com> wrote:

> On Oct 14, 2010, at 1:38 AM, Poh Yong Hwang wrote:
> > Hi,
> >
> > I have one KVM instance (CentOS 5) that keeps crashing, and I see the
> > following in the message log:
> >
> > Oct 14 16:24:48 localhost kernel: psmouse.c: Explorer Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
> > Oct 14 16:24:49 localhost kernel: BUG: soft lockup - CPU#0 stuck for 12s! [ntpd:2363]
> > Oct 14 16:24:49 localhost kernel: CPU 0:
> > Oct 14 16:24:49 localhost kernel: Modules linked in: backupdriver(PU) ipv6 xfrm_nalgo crypto_api autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc talpa_pedevice(U) dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy virtio_balloon virtio_pci ide_cd i2c_piix4 virtio_ring 8139too cdrom 8139cp pcspkr i2c_core virtio mii serio_raw dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
> > Oct 14 16:24:49 localhost kernel: Pid: 2363, comm: ntpd Tainted: P  2.6.18-194.3.1.el5 #1
> [...]
> > After that, the instance becomes very sluggish and unresponsive. Please
> > advise what could be the issue.
>
> I'm no expert on kernel stuff, but I thought I'd throw in a few points of
> clarification on your request, since the above is not clear to me.
>
> Is the above from /var/log/messages on the guest or on the host?
>
> Is it always an "ntpd" process on the CPU#0 stuck/soft lockup line? Does
> the soft lockup always occur after a psmouse.c warning? (Even so, the
> psmouse.c warning could be a symptom of the CPU being stuck, not the
> cause...)
>
> What type of hardware is this? Noticing that it says "Tainted", and
> assuming this refers to the kernel (I have no idea how a userland process,
> ntpd, could be "tainted"!), you have a binary-distributed kernel module
> loaded, and you should probably try with it unloaded to see if the issue
> goes away. It could be a machine check error, but I think that's less
> likely. To double-check, run the following on both the host and the guest:
>
> cat /proc/sys/kernel/tainted
>
> This ORed value can be checked against the flags given in
> http://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>
> Eric
> _______________________________________________
> CentOS-virt mailing list
> CentOS-virt at centos.org
> http://lists.centos.org/mailman/listinfo/centos-virt