[CentOS-virt] BUG: soft lockup detected on CPU#?
Eli Stair
estair at ilm.com
Mon Jan 21 19:11:19 UTC 2008
My un-authoritative answer: I've been tracking this bug (or several with the
same symptoms) for going on a couple of years. It's ridiculously common and,
judging by the number of reports/bugs posted, apparently well known to the
Xen/XenSource guys, but I haven't seen any mention of it actually being
addressed and resolved. Unfortunately I see the same issue cropping up after
VM moves, though for me it occurs /every/ time there is a VM migration, once
per processor in the VM; it doesn't matter whether there is any IO on the Dom0
or DomU. Occasionally VMs die during a migration and have to be manually
destroyed/restarted.
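
To check the "once per processor" pattern on a given guest, the messages can be
tallied per CPU from a saved console or kernel log. What follows is only a
minimal sketch in Python, not something taken from this thread: the function
name count_lockups and the sample lines are made up for illustration, and it
assumes a real log prints a numeric CPU id where the subject line shows "?".

```python
import re
from collections import Counter

# Matches the message quoted in this thread. Assumption: on a real console
# the "?" after "CPU#" is a decimal CPU number.
LOCKUP_RE = re.compile(r"BUG: soft lockup detected on CPU#(\d+)")


def count_lockups(lines):
    """Return a Counter mapping CPU number to soft-lockup message count."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not captured output.
    sample = [
        "BUG: soft lockup detected on CPU#0!",
        "some unrelated kernel message",
        "BUG: soft lockup detected on CPU#1!",
    ]
    for cpu, n in sorted(count_lockups(sample).items()):
        print(f"CPU#{cpu}: {n}")
```

Feeding it the lines logged right after a migration (for example, by iterating
over an open log file) would show whether each virtual CPU reported exactly
one lockup.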
I do, however, see evidence of significant instability (not implying it is
related to the above soft lockup issues), both in VM moves migrating from a
Xeon (5345) to an Opteron Dom0 and in high-utilization DomUs, which are just
plain flaky and reboot/die semi-frequently even when never moved from their
starting Dom0.
For me, it currently means running only low-priority non-production services in
a VM, and not shelling out for RHEL5 support for the project (contrary to what
I planned) since it's not being addressed. I'd be curious if this is being
addressed in the Xen 3.2 release for RHEL5*...
Cheers,
/eli
Brett Worth wrote:
> Hello All.
>
> I've just started looking into Xen and have a test environment in place.
> I'm seeing an annoying problem that I thought worthy of a post.
>
> Config:
>
> I have 2 x HP DL585 servers, each with 4 dual-core Opterons (non-vmx) and
> 16GB RAM, configured as Xen servers. These run CentOS 5.1 with the latest
> updates applied. Both systems attach to an iSCSI target, which is an HP
> DL385 running ietd and serving SAN-based storage.
>
> I have a test VM running CentOS 5.1 also updated.
>
> Problem:
>
> If I run the VM on a single server everything is OK. If I do a migrate of
> the VM to the other server I start getting random "BUG: soft lockup
> detected on CPU#?" messages on the VM console. The messages seem to happen
> with IO but not every time. A reboot of the VM on the new server will stop
> these messages.
>
> I've also left the VM running overnight a couple of times and when I do I
> find that any external sessions (ssh) are hung in the morning but the
> console session is not. New ssh sessions can be started and seem to work.
>
> After much googling it looks like the kernel messages can occur if dom0 is
> very busy but mine is not.
>
> Any suggestions?
>
> Regards
>
> Brett Worth
>
> _______________________________________________
> CentOS-virt mailing list
> CentOS-virt at centos.org
> http://lists.centos.org/mailman/listinfo/centos-virt
>