[CentOS] Kernel Panic!

Tue Sep 6 17:00:29 UTC 2005
Mark Elam <melam at mobilygen.com>

Hey all,

Long-time user of CentOS here; I really love what you guys are doing.  I
have a farm of 50 CentOS 4.1 machines (originally 4.0, updated with yum
to the current 4.1).  Ever since I updated to the 2.6.9-11 kernel I have
been getting a lot of kernel panics.  Seven machines panicked over the
weekend, and they are the only ones that have been booted into the new
kernel.  The rest haven't been rebooted yet, so they are still running
the 2.6.9-5 kernel.  The panicked machines all have similar messages in
their logs, as shown below.  Any ideas on where to look for the problem?
Has anyone else seen this?

Machine info (typical of all 50 machines):

P4 3 GHz
2 GB RAM
U320 SCSI disk w/ LSI SCSI controller
Intel workstation boards
Nvidia graphics

All machines are exactly the same, installed w/ kickstart w/ these packages:

%packages
@ office
@ legacy-software-development
@ editors
@ system-tools
@ base-x
@ gnome-software-development
@ graphics
@ smb-server
@ development-tools
@ printing
@ text-internet
@ kde-software-development
@ kde-desktop
@ x-software-development
@ mail-server
@ legacy-network-server
@ sound-and-video
@ gnome-desktop
@ ftp-server
@ network-server
@ graphical-internet
vnc
telnet-server
rsh-server
kernel-smp
grub
kernel
xemacs
rusers-server


/var/log/messages:

Sep  3 04:03:10 qu015 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000054
Sep  3 04:03:10 qu015 kernel:  printing eip:
Sep  3 04:03:10 qu015 kernel: c016c583
Sep  3 04:03:10 qu015 kernel: *pde = 0bb6d001
Sep  3 04:03:10 qu015 kernel: Oops: 0000 [#1]
Sep  3 04:03:10 qu015 kernel: SMP
Sep  3 04:03:10 qu015 kernel: Modules linked in: nvidia(U) vmnet(U) vmmon(U) nfs nfsd exportfs lockd sunrpc md5 ipv6 parport_pc lp parport autofs4 sr_mod ide_scsi dm_mod button battery ac joydev uhci_hcd ehci_hcd snd_maestro3 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd soundcore e1000 floppy ext3 jbd mptscsih mptbase sd_mod scsi_mod
Sep  3 04:03:10 qu015 kernel: CPU:    1
Sep  3 04:03:10 qu015 kernel: EIP:    0060:[<c016c583>]    Tainted: P VLI
Sep  3 04:03:10 qu015 kernel: EFLAGS: 00010202   (2.6.9-11.ELsmp)
Sep  3 04:03:10 qu015 kernel: EIP is at iput+0x25/0x61
Sep  3 04:03:10 qu015 kernel: eax: 00000040   ebx: f3a85b74   ecx: f8c7bbb9   edx: f3a85b74
Sep  3 04:03:10 qu015 kernel: esi: f222b89c   edi: f222b8a4   ebp: 0000006b   esp: f7ceeeec
Sep  3 04:03:10 qu015 kernel: ds: 007b   es: 007b   ss: 0068
Sep  3 04:03:10 qu015 kernel: Process kswapd0 (pid: 43, threadinfo=f7cee000 task=f7d1b7b0)
Sep  3 04:03:10 qu015 kernel: Stack: f3a85b74 c016a1d8 00000000 00000092 00000000 f7ffe9e0 c016a553 c0144e2c
Sep  3 04:03:10 qu015 kernel:        00d70a00 00000000 00000061 00000000 00023313 000000d0 00000020 c031ad80
Sep  3 04:03:10 qu015 kernel:        00000002 c031ad80 0000000c c01460b8 c02c5604 00023313 f7ceef9c 00000000
Sep  3 04:03:10 qu015 kernel: Call Trace:
Sep  3 04:03:10 qu015 kernel:  [<c016a1d8>] prune_dcache+0x13f/0x18e
Sep  3 04:03:10 qu015 kernel:  [<c016a553>] shrink_dcache_memory+0x14/0x2b
Sep  3 04:03:10 qu015 kernel:  [<c0144e2c>] shrink_slab+0xf8/0x161
Sep  3 04:03:10 qu015 kernel:  [<c01460b8>] balance_pgdat+0x1d2/0x2f8
Sep  3 04:03:10 qu015 kernel:  [<c02c5604>] schedule+0x844/0x87a
Sep  3 04:03:10 qu015 kernel:  [<c01462a8>] kswapd+0xca/0xcc
Sep  3 04:03:10 qu015 kernel:  [<c011f6ee>] autoremove_wake_function+0x0/0x2d
Sep  3 04:03:10 qu015 kernel:  [<c02c7296>] ret_from_fork+0x6/0x14
Sep  3 04:03:10 qu015 kernel:  [<c011f6ee>] autoremove_wake_function+0x0/0x2d
Sep  3 04:03:10 qu015 kernel:  [<c01461de>] kswapd+0x0/0xcc
Sep  3 04:03:10 qu015 kernel:  [<c01041f1>] kernel_thread_helper+0x5/0xb
Sep  3 04:03:10 qu015 kernel: Code: ff e9 fa fe ff ff 53 85 c0 89 c3 74 58 83 bb 3c 01 00 00 20 8b 80 a4 00 00 00 8b 40 24 75 08 0f 0b 4c 04 3d e7 2d c0 85 c0 74 0b <8b> 50 14 85 d2 74 04 89 d8 ff d2 8d 43 1c ba 70 fc 31 c0 e8 59
Sep  3 04:03:10 qu015 kernel:  <0>Fatal exception: panic in 5 seconds
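
To check whether the seven machines are really dying in the same place, it may help to reduce each oops to its faulting function and call-trace symbols and compare those. The following is only a rough sketch, not a tested tool: the function name summarize_oops and both regular expressions are my own assumptions, written against the 2.6.9 i386 oops format shown above, and other kernels or architectures may format these lines differently.

```python
import re

# Assumed pattern for lines such as "EIP is at iput+0x25/0x61".
EIP_RE = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Assumed pattern for call-trace entries such as
# "[<c016a1d8>] prune_dcache+0x13f/0x18e". The optional whitespace
# before "+" tolerates entries that a mail client wrapped onto two lines.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]{8}>\]\s+(\w+)\s*\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return (faulting_function, call_trace_symbols) for one oops.

    faulting_function is None when no "EIP is at" line is found.
    """
    eip = EIP_RE.search(text)
    fault = eip.group(1) if eip else None
    frames = FRAME_RE.findall(text)
    return fault, frames
```

Running this over the oops text saved from each panicked machine and comparing the returned tuples would show whether every box faults in the same function with the same call path.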

Thanks!  

Mark Elam