[CentOS-virt] BUG: soft lockup detected on CPU#?

Sun Jan 20 00:29:38 UTC 2008
Brett Worth <brett at worth.id.au>

Hello All.

I've just started looking into Xen and have a test environment in place.  I'm seeing an
annoying problem that I thought worthy of a post.


I have 2 x HP DL585 servers, each with 4 dual-core Opterons (non-vmx) and 16GB RAM,
configured as Xen servers.  These run CentOS 5.1 with the latest updates applied.  Both
systems attach to an iSCSI target, which is an HP DL385 running ietd and serving SAN-based
storage.

I have a test VM, also running an updated CentOS 5.1.


If I run the VM on a single server everything is OK.  If I migrate the VM to the
other server I start getting random "BUG: soft lockup detected on CPU#?" messages on the
VM console.  The messages seem to coincide with I/O, but not every time.  Rebooting the VM
on the new server stops these messages.

I've also left the VM running overnight a couple of times, and when I do I find that any
external sessions (ssh) are hung in the morning, but the console session is not.  New ssh
sessions can be started and seem to work.

After much googling, it looks like these kernel messages can occur if dom0 is very busy,
but mine is not.

Any suggestions?


Brett Worth