Thanks Johnny, and sorry for the top post (BlackBerry).

I downloaded the src rpm and found their patches all in one patch file
called xen.patch (I did an ls -lt and picked the files with the latest
timestamps). There may also be kernel config changes, as several config
files were touched, but I couldn't get hold of the original 8.1.8 src rpm
to diff them.

I would be happy to help in getting the parts needed so they can be rolled
up into a single patch to apply to the current plus kernel. Just let me
know what you need.

I wonder if anybody at XenSource notified upstream of the fixes?

-Ross

----- Original Message -----
From: centos-devel-bounces at centos.org <centos-devel-bounces at centos.org>
To: The CentOS developers mailing list. <centos-devel at centos.org>
Sent: Wed Jan 23 07:37:04 2008
Subject: Re: [CentOS-devel] RE: [CentOS-virt] BUG: soft lockup detected onCPU#?

Ross S. W. Walker wrote:
> Ross S. W. Walker wrote:
>> Brett Worth wrote:
>>> Hello All.
>>>
>>> I've just started looking into Xen and have a test environment in
>>> place. I'm seeing an annoying problem that I thought worthy of a post.
>>>
>>> Config:
>>>
>>> I have 2 x HP DL585 servers each with 4 dual-core Opterons (non-vmx)
>>> and 16GB RAM configured as Xen servers. These run CentOS 5.1 with the
>>> latest updates applied. These systems both attach to an iSCSI target
>>> which is an HP DL385 running ietd and serving SAN based storage.
>>>
>>> I have a test VM running CentOS 5.1 also updated.
>>>
>>> Problem:
>>>
>>> If I run the VM on a single server everything is OK. If I do a migrate
>>> of the VM to the other server I start getting random "BUG: soft lockup
>>> detected on CPU#?" messages on the VM console. The messages seem to
>>> happen with IO but not every time. A reboot of the VM on the new
>>> server will stop these messages.
>>>
>>> I've also left the VM running overnight a couple of times and when I
>>> do I find that any external sessions (ssh) are hung in the morning
>>> but the console session is not. New ssh sessions can be started and
>>> seem to work.
>>>
>>> After much googling it looks like the kernel messages can occur if
>>> dom0 is very busy but mine is not.
>>>
>>> Any suggestions?
>>
>> The soft lockup is technically not a BUG.
>>
>> You will see these errors if an IRQ takes more than 10 seconds
>> to respond.
>>
>> In your case I would take a look at your iSCSI setup and the
>> time it takes to migrate the VM from one node to another along
>> with SCSI reserve/release setup on the iSCSI target.
>>
>> I also have been using the Xen 3.2 RPMs off xen.org on CentOS
>> 5.1 with good results. The VM migration may run smoother and
>> quicker in Xen 3.2, but in doing so you take Xen off the
>> reservation; if you're OK with that it may fix your issues.
>
> After seeing this same issue on my Xen 3.2 install, but with NO
> migration or iSCSI happening, I decided it is probably NOT iSCSI's
> fault, so I researched it a little more and this is what I found:
>
> http://docs.xensource.com/XenServer/4.0.1/guest/ch04s08.html#rhel5_limitations
>
> XenSource does provide a repo of CentOS 5 kernels that have been
> patched to fix this though:
>
> http://updates.xensource.com/XenServer/4.0.1/centos5x/
>
> But these seem to be woefully out of date.
>
> I wonder if a kind soul would add the fix to the centosplus kernel
> with XenSource's patch so those rogue Xen users could benefit from
> this fix until upstream decides to include it.
>
> I suppose the centosplus patch would need to be flagged interim in
> case it needs to be removed when upstream has their own fix.

Ross,

Thanks for researching this.

I can probably add this to the next centosplus kernels, though I usually
do not like to add patches ...
and I will need to grab their kernels and work out what is patched and
try to roll it into our kernels.

-- 
Johnny Hughes
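
For the config comparison mentioned at the top of this message, here is a minimal Python sketch. It assumes both kernel config files are available locally once the original src rpm turns up; the file names in the usage comment are hypothetical, and nothing here is taken from the XenSource rpm or the CentOS build scripts. It only lists options whose values differ between the two files, treating "# CONFIG_X is not set" as the value "n".

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value} for a kernel config; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old_text, new_text):
    """Return sorted (option, old_value, new_value) tuples for options that
    differ. A value of None means the option is absent from that file."""
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes.append((name, old.get(name), new.get(name)))
    return changes


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names):
    #   python diffconfig.py original.config xensource.config
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
            for name, before, after in diff_configs(f_old.read(), f_new.read()):
                print(f"{name}: {before} -> {after}")
```

Running it over each touched config file pair would show which options changed, separately from the code changes in xen.patch.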