Thanks Johnny, and sorry for the top post (blackberry).
I d/l'd the src rpm and found their patches all in 1 patch file called xen.patch (I did an ls -lt and picked the files with the latest timestamps). There may also be kernel config changes as several config files were touched, but I couldn't get a hold of the original 8.1.8 src rpm to diff them.
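For anyone repeating this, the `ls -lt` step can be sketched in Python roughly as below. This is only an illustration: it assumes the source RPM has already been unpacked into a directory, and the `newest_files` helper name is hypothetical, not something from XenSource's package.

```python
import os


def newest_files(directory, limit=10):
    """Return up to `limit` regular files in `directory`, newest first by mtime.

    Roughly what `ls -lt | head` shows. `directory` is assumed to be wherever
    the source RPM was unpacked (a hypothetical path, not from the post).
    """
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((os.path.getmtime(path), name))
    # Most recently modified first; ties are broken by file name.
    entries.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in entries[:limit]]
```

The files at the top of the returned list are the candidates to inspect first, in the same way the latest timestamps pointed at xen.patch.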
I would be happy to help in getting the parts needed so they can be rolled up into a single patch to apply to the current plus kernel. Just let me know what you need.
I wonder if anybody at XenSource notified upstream of the fixes?
-Ross
----- Original Message -----
From: centos-devel-bounces@centos.org
To: The CentOS developers mailing list <centos-devel@centos.org>
Sent: Wed Jan 23 07:37:04 2008
Subject: Re: [CentOS-devel] RE: [CentOS-virt] BUG: soft lockup detected on CPU#?
Ross S. W. Walker wrote:
Ross S. W. Walker wrote:
Brett Worth wrote:
Hello All.
I've just started looking into Xen and have a test environment in place. I'm seeing an annoying problem that I thought worthy of a post.
Config:
I have 2 x HP DL585 servers, each with 4 dual-core Opterons (non-vmx) and 16GB RAM, configured as Xen servers. These run CentOS 5.1 with the latest updates applied. Both systems attach to an iSCSI target, which is an HP DL385 running ietd and serving SAN-based storage.
I have a test VM running CentOS 5.1 also updated.
Problem:
If I run the VM on a single server everything is OK. If I migrate the VM to the other server I start getting random "BUG: soft lockup detected on CPU#?" messages on the VM console. The messages seem to coincide with IO, but not every time. A reboot of the VM on the new server will stop these messages.
I've also left the VM running overnight a couple of times and when I do I find that any external sessions (ssh) are hung in the morning but the console session is not. New ssh sessions can be started and seem to work.
After much googling it looks like the kernel messages can occur if dom0 is very busy, but mine is not.
Any suggestions?
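To check whether the messages cluster on particular CPUs, the console output could be tallied with something like the following. This is a minimal sketch: it assumes the VM console output has been saved to a text file and that each message follows the "BUG: soft lockup detected on CPU#N" form quoted above; `count_soft_lockups` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the message form quoted in this thread; the group is the CPU number.
SOFT_LOCKUP_RE = re.compile(r"BUG: soft lockup detected on CPU#(\d+)")


def count_soft_lockups(lines):
    """Count soft lockup messages per CPU number in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

It could be run over a saved log with `count_soft_lockups(open("console.log"))`, where `console.log` is a placeholder file name.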
The soft lockup is technically not a BUG.
You will see these errors if an IRQ takes more than 10 seconds to respond.
In your case I would take a look at your iSCSI setup and the time it takes to migrate the VM from one node to another along with SCSI reserve/release setup on the iSCSI target.
I have also been using the Xen 3.2 RPMs off xen.org on CentOS 5.1 with good results. VM migration may run more smoothly and quickly in Xen 3.2, but in doing so you take Xen off the reservation; if you're OK with that, it may fix your issues.
After seeing this same issue on my Xen 3.2 install, with NO migration or iSCSI involved, I decided it is probably NOT iSCSI's fault. I researched it a little more, and this is what I found:
http://docs.xensource.com/XenServer/4.0.1/guest/ch04s08.html#rhel5_limitatio...
XenSource does provide a repo of CentOS 5 kernels that have been patched to fix this though:
http://updates.xensource.com/XenServer/4.0.1/centos5x/
But these seem to be woefully out of date.
I wonder if a kind soul would add XenSource's patch to the centosplus kernel so those rogue Xen users could benefit from the fix until upstream decides to include it.
I suppose the centosplus patch would need to be flagged as interim in case it needs to be removed when upstream has their own fix.
Ross,
Thanks for researching this.
I can probably add this to the next centosplus kernels, though I usually do not like to add patches ... and I will need to grab their kernels and work out what is patched and try to roll it into our kernels.
-- Johnny Hughes