On Tue, Sep 06, 2005 at 12:15:08PM -0500, Johnny Hughes wrote:
On Tue, 2005-09-06 at 10:00 -0700, Mark Elam wrote:
Hey all,
Long-time user of CentOS here, and I really love what you guys are doing. I have a farm of 50 CentOS 4.1 machines (originally 4.0, updated with yum to the current 4.1). Ever since I updated to the 2.6.9-11 kernel I have been getting a lot of kernel panics: 7 machines panicked over the weekend. Funny that they are the only ones booted into the new kernel! The rest haven't been rebooted yet, so they are still on the 2.6.9-5 kernel. They all have similar messages in the logs, as shown below. Any ideas on where to look for the problem? Has anyone else seen this?
Machine info: Typical of all 50 machines:
P4 3GHz
2GB RAM
U320 SCSI disk w/ LSI SCSI controller
Intel workstation boards
Nvidia graphics
Verify that all the boards have the latest BIOS updates (check for one, install the latest BIOS, and see if it corrects the issue).
All machines exactly the same, installed w/ kickstart w/ these packages:
%packages
...
/var/log/messages:
Sep 3 04:03:10 qu015 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000054
Sep 3 04:03:10 qu015 kernel: printing eip:
Sep 3 04:03:10 qu015 kernel: c016c583
Sep 3 04:03:10 qu015 kernel: *pde = 0bb6d001
Sep 3 04:03:10 qu015 kernel: Oops: 0000 [#1]
Sep 3 04:03:10 qu015 kernel: SMP
Sep 3 04:03:10 qu015 kernel: Modules linked in: nvidia(U) vmnet(U)
                                                ^^^^^^^^^^^^^^^^^
vmmon(U) nfs nfsd exportfs lockd sunrpc md5 ipv6 parport_pc lp parport
^^^^^^^^ ...
Can you also test without the Nvidia and VMware drivers, or install the latest versions if you haven't already?
Cheers,
Tru
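
With 50 machines to check, it may help to pull the marked modules out of each oops automatically. The following is only a rough sketch, not anything posted in this thread: the function name flagged_modules is made up for illustration, and it assumes the "Modules linked in:" line has been joined back into a single string. It simply picks out the names carrying the (U) marker that the log above shows next to nvidia, vmnet and vmmon. The sample line is truncated at the same point as the log in the original message.

```python
import re


def flagged_modules(oops_line, flag="U"):
    """Return the module names that carry the given flag, e.g. nvidia(U).

    Hypothetical helper for illustration; not part of any existing tool.
    """
    # Keep only the text after the "Modules linked in:" marker.
    # If the marker is absent, partition() leaves this part empty.
    _, _, modules = oops_line.partition("Modules linked in:")
    # A flagged module looks like name(U); unflagged ones are bare names.
    pattern = re.compile(r"(\w+)\(" + re.escape(flag) + r"\)")
    return pattern.findall(modules)


# Module list copied from the log above (truncated there as well).
line = (
    "Sep 3 04:03:10 qu015 kernel: Modules linked in: nvidia(U) vmnet(U) "
    "vmmon(U) nfs nfsd exportfs lockd sunrpc md5 ipv6 parport_pc lp parport"
)

print(flagged_modules(line))
```

Running this over the matching line from each panicked machine would show quickly whether the same three modules are loaded every time, which is a reasonable first check before unloading them for a test boot.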