[CentOS] BUG: soft lockup - CPU#1 stuck for 61s!

Sat Jul 26 16:45:06 UTC 2008
Johnny Hughes <johnny at centos.org>

Ian jonhson wrote:
> Dear All,
> 
> I chose CentOS 5 for my work and installed it on two Dell PowerEdge 1950
> servers. However, after running for two days the machines crashed, and
> syslog contained the following messages:
> 
> ------------------  part dump of /var/log/messages  ----------------------
> 
> .....
> Jul 25 02:15:02 vega2008 kernel:  [<c045bc58>] ? exit_mmap+0x93/0xc9
> Jul 25 02:15:02 vega2008 kernel:  [<c04214c2>] ? mmput+0x25/0x68
> Jul 25 02:15:02 vega2008 kernel:  [<c046e9c9>] ? flush_old_exec+0x4f8/0x777
> Jul 25 02:15:02 vega2008 kernel:  [<c046dfcf>] ? kernel_read+0x32/0x43
> Jul 25 02:15:02 vega2008 kernel:  [<c0490e60>] ? load_elf_binary+0x359/0x1152
> Jul 25 02:15:02 vega2008 kernel:  [<c045a6ee>] ? get_user_pages+0x2d5/0x35c
> Jul 25 02:15:02 vega2008 kernel:  [<c04570d2>] ? page_address+0x78/0x98
> Jul 25 02:15:02 vega2008 kernel:  [<c045735a>] ? kmap_high+0x19/0x16b
> Jul 25 02:15:02 vega2008 kernel:  [<c04570d2>] ? page_address+0x78/0x98
> Jul 25 02:15:02 vega2008 kernel:  [<c046dced>] ? copy_strings+0x169/0x173
> Jul 25 02:15:02 vega2008 kernel:  [<c046ddad>] ? search_binary_handler+0x8f/0x1af
> Jul 25 02:15:02 vega2008 kernel:  [<c046efe7>] ? do_execve+0x133/0x194
> Jul 25 02:15:02 vega2008 kernel:  [<c04030d7>] ? sys_execve+0x2a/0x4a
> Jul 25 02:15:02 vega2008 kernel:  [<c04047aa>] ? syscall_call+0x7/0xb
> Jul 25 02:15:02 vega2008 kernel:  [<c0610000>] ? early_init_intel+0x0/0x3c
> Jul 25 02:15:02 vega2008 kernel:  =======================
> Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#7 stuck for 61s! [sshd:24188]
> Jul 25 02:15:02 vega2008 kernel:
> Jul 25 02:15:02 vega2008 kernel: Pid: 24188, comm: sshd Not tainted (2.6.25.3 #3)
> Jul 25 02:15:02 vega2008 kernel: EIP: 0060:[<c06154f0>] EFLAGS: 00200293 CPU: 7
> Jul 25 02:15:02 vega2008 kernel: EIP is at _spin_lock+0xa/0x15
> Jul 25 02:15:02 vega2008 kernel: EAX: c079349c EBX: f79ec580 ECX: ffffffff EDX: 00008381
> Jul 25 02:15:02 vega2008 kernel: ESI: ffffffff EDI: f79ec580 EBP: f68a6580 ESP: f1824e50
> Jul 25 02:15:02 vega2008 kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Jul 25 02:15:02 vega2008 kernel: CR0: 8005003b CR2: b7ebf978 CR3: 32996000 CR4: 000006f0
> Jul 25 02:15:02 vega2008 kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> Jul 25 02:15:02 vega2008 kernel: DR6: ffff0ff0 DR7: 00000400
> Jul 25 02:15:02 vega2008 kernel:  [<c04119c7>] ? native_flush_tlb_others+0x49/0x9b
> Jul 25 02:15:02 vega2008 kernel:  [<c0411e65>] ? flush_tlb_mm+0x51/0x54
> Jul 25 02:15:02 vega2008 kernel:  [<c045bc58>] ? exit_mmap+0x93/0xc9
> Jul 25 02:15:02 vega2008 kernel:  [<c04214c2>] ? mmput+0x25/0x68
> Jul 25 02:15:02 vega2008 kernel:  [<c046e9c9>] ? flush_old_exec+0x4f8/0x777
> Jul 25 02:15:02 vega2008 kernel:  [<c046dfcf>] ? kernel_read+0x32/0x43
> Jul 25 02:15:02 vega2008 kernel:  [<c0490e60>] ? load_elf_binary+0x359/0x1152
> Jul 25 02:15:02 vega2008 kernel:  [<c045a6ee>] ? get_user_pages+0x2d5/0x35c
> Jul 25 02:15:02 vega2008 kernel:  [<c04570d2>] ? page_address+0x78/0x98
> Jul 25 02:15:02 vega2008 kernel:  [<c045735a>] ? kmap_high+0x19/0x16b
> Jul 25 02:15:02 vega2008 kernel:  [<c04570d2>] ? page_address+0x78/0x98
> Jul 25 02:15:02 vega2008 kernel:  [<c046dced>] ? copy_strings+0x169/0x173
> Jul 25 02:15:02 vega2008 kernel:  [<c046ddad>] ? search_binary_handler+0x8f/0x1af
> Jul 25 02:15:02 vega2008 kernel:  [<c046efe7>] ? do_execve+0x133/0x194
> Jul 25 02:15:02 vega2008 kernel:  [<c04030d7>] ? sys_execve+0x2a/0x4a
> Jul 25 02:15:02 vega2008 kernel:  [<c04047aa>] ? syscall_call+0x7/0xb
> Jul 25 02:15:02 vega2008 kernel:  [<c0610000>] ? early_init_intel+0x0/0x3c
> Jul 25 02:15:02 vega2008 kernel:  =======================
> Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#1 stuck for 61s! [http_cap:12228]
> .............
> -----------------------------------------------------------------------------------
> 
> 
> I chose CentOS because I believe it is the most stable OS for
> commodity machines, but I don't know what to do about this problem.
> 
> The one difference from a stock install is that I applied the PF_RING
> patch to the original kernel and recompiled it to run my machines. I
> wonder whether the patched kernel is what crashes CentOS, because PF_RING
> automatically downloads the kernel source from www.kernel.org rather than
> the one from www.centos.org. Or does CentOS itself have bugs in its
> distribution?
> 

This is very common. A Google search for:

'poweredge 1950' 'BUG: soft lockup' 'stuck'

produces almost 2000 results.  It seems to have something to do with the
on-board network ports.

I have not seen this problem myself ... has anyone else?
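
For anyone comparing notes on this, here is a rough Python sketch for
tallying the soft lockup reports in a log excerpt by CPU and process name.
It is only an illustration: the function name summarize_lockups and the
regular expression are my own assumptions based on the message format
quoted above, and it tolerates lines that a mail client has wrapped or
prefixed with "> ".

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#7 stuck for 61s! [sshd:24188]"
# (pattern inferred from the quoted log, not from kernel documentation).
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for\s+(\d+)s!\s+\[([^:\]]+):(\d+)\]"
)

def summarize_lockups(text):
    """Count soft lockup reports per (cpu, command) pair."""
    # Strip mail quote markers, then collapse all whitespace so that
    # messages wrapped across two lines still match the pattern.
    unquoted = re.sub(r"^>\s?", "", text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    counts = Counter()
    for cpu, _secs, comm, _pid in LOCKUP_RE.findall(flat):
        counts[(int(cpu), comm)] += 1
    return counts
```

On a live machine one would presumably feed it the contents of
/var/log/messages, for example
summarize_lockups(open("/var/log/messages").read()), and check whether the
reports cluster on one CPU or one process.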
