[CentOS] Kernel Panic on HP/Compaq ProLiant G7

Thu Mar 24 20:56:12 UTC 2011
Windsor Dave L. (AdP/TEF7) <Dave.Windsor at us.bosch.com>


On 3/24/2011 4:38 PM, Dr. Ed Morbius wrote:
> Dave:
>
> on 16:03 Thu 24 Mar, Windsor Dave L. (AdP/TEF7.1) (Dave.Windsor at us.bosch.com) wrote:
>> Hello Everyone,
>>
>> Code: 00 00 00 00 00 00 00 00 70 4d 4f 9d 00 81 ff ff 98 e4 4b dc
>> RIP  [<ffff8100dc435cf0>]
>>   RSP<ffff81001529fd18>
>> CR2: ffff8100dc435cf0
>>   <0>Kernel panic - not syncing: Fatal exception
>>
>>
>> This suggests that something happened in a Samba process.
>
> Correct.
>
> If this is regularly happening in Samba, that would point to a problem
> with your samba config (either on that host, something remotely stuffing
> bad packets at you, or likely in that case, both, as bad data shouldn't
> crash the host).

I can have a network analyst monitor the ports for unusual bursts of
traffic, although that might not catch small amounts of strange data.
>
> If this is happening in different programs over time, then the problem
> is likely /not/ software, but hardware/firmware.
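
To check whether the panics cluster in one program or are spread across
several, the process name recorded in each captured oops can be tallied.
A minimal Python 3 sketch, assuming each oops header contains a line of
the form "Pid: 1234, comm: smbd ..." (the usual layout on 2.6.18-era
x86_64 kernels, but verify against your own traces). The function name
tally_comms is made up for illustration:

```python
import re
from collections import Counter

# Assumed oops header layout (check against real traces):
#   "Pid: 4242, comm: smbd Not tainted 2.6.18-... #1"
COMM_RE = re.compile(r"Pid:\s*(\d+),\s*comm:\s*(\S+)")


def tally_comms(text):
    """Count how often each process name appears in oops headers."""
    return Counter(match.group(2) for match in COMM_RE.finditer(text))


def report(text):
    """Return one 'count  name' line per process, most frequent first."""
    return [f"{count:4d}  {comm}"
            for comm, count in tally_comms(text).most_common()]
```

Feeding it the text of every panic collected so far, for example
report(open("panics.txt").read()) where panics.txt is a hypothetical file
of saved console output, gives a count per process name. A tally dominated
by smbd points toward Samba; an even spread across unrelated programs
points toward hardware or firmware.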
>
> The LKML may be able to help you on your panic; please read their bug
> posting guidelines /BEFORE/ posting.
>
>> I have the Samba3x packages installed since we are beginning to
>> introduce Win7 clients into our environment.
>
> What happens if you take the Win7 clients away?
>
>> Googling "Kernel panic - not syncing: Fatal exception" and "CentOS"
>
> That is the generic kernel panic message.  It's going to be
> spectacularly unspecific.
>
>> produced many hits, but nothing that seemed to exactly match my
>> problem.  Since this is the only G7 server I have here right now, I
>> can't reproduce the problem on another machine.  The G6s I have
>> running the identical version of CentOS have no problems.
>>
>> I am trying to determine if this is pointing to a hardware or software
>> issue.  Some of the Google results suggested using a Centosplus kernel
>> - is this a good idea?
>
> Dell have had numerous issues with recent server editions; it's possible
> HP have as well:
>
>   - If you haven't, configure the netconsole kernel module for
>     kernel-enabled network logging of panics.
       This is a great idea.  I will work on that soonest.
>
>   - Call HP and find out what the latest recommended BIOS and firmware
>     upgrades for your system are.  C-STATE has been a particular issue
>     with Dell, and it's been disabled entirely in recent BIOS versions.
>     I see below you've updated BIOS.
>
>   - Scan logs for other messages, particularly panics and/or ECC issues.
       I haven't seen anything ominous, although I have noticed a long
       time gap between the last entry in /var/log/messages and the actual
       crash.  Such a gap in entries is very unusual.
>
>   - If you can stand the downtime, run memtest86+ at least overnight on
>     your RAM.  A reboot indicates a failed test.
>
>   - Otherwise: try running with half your RAM swapped.
>
>   - Check/reseat all DIMMs, sockets, and cables.  Some folks caution
>     against this on the basis of connector wear, but if you've got a
>     problem, this may help resolve it, and I've seen boxes shipped with
>     components poorly or even un-cabled.
       We have one DIMM of 4 GB RAM, so I can't swap it out or run with
       half.  I have reseated it and inspected the contacts, and it looks OK.
       I will look at anything else with connectors.

>
>   - Does a similarly equipped system exhibit the same problems?
>
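
For the receiving end of netconsole: the module sends each kernel message
as a plain UDP datagram, by default to port 6666 on the target host, so
any UDP listener on a second machine can capture a panic that never
reaches the local disk. A minimal Python 3 sketch of such a listener
follows. The function names and the output path are illustrative
assumptions, and netcat or a syslog daemon accepting remote messages
would also work:

```python
import socket
import time


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock, timeout=None):
    """Return (sender_ip, text) for the next datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return sender_ip, data.decode("utf-8", errors="replace")


def log_forever(sock, path):
    """Append every received message to path with a local timestamp."""
    with open(path, "a") as out:
        while True:
            # timeout=None blocks, so receive_one always returns a tuple here
            sender_ip, text = receive_one(sock)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write(f"{stamp} {sender_ip} {text.rstrip()}\n")
            out.flush()  # keep the file current in case the sender dies
```

Run on a second machine as log_forever(open_listener(), "netconsole.log"),
with the G7's netconsole target pointed at that machine's address. The
timestamps come from the receiver's clock, which also helps pin down the
time of the crash.
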
>> The server is a HP DL380 G7 Server with 4 GB RAM (1 DIMM 1333 MHz),
>> one 4-core CPU (2133 MHz), 4 built-in Broadcom "NetXtreme II BCM5709
>> Gigabit Ethernet" NICs, and a P410 Smart Array Controller.  The
>> P410 and the system BIOS have both been updated to the latest levels
>> to see if that fixes the crashes, with no change.
>
> Ugh.  Broadcom's gotten better but I prefer Intel NICs.  Can't speak to
> the others.  And OK, you've updated BIOS.
>
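
The gap between the last entry in /var/log/messages and the crash can be
measured rather than eyeballed. A minimal Python 3 sketch, assuming the
traditional syslog line format that begins with a 15-character stamp such
as "Mar 24 15:58:01" (no year, so one has to be supplied) and an English
locale for month names. The name find_gaps and the 30-minute threshold are
arbitrary choices for illustration:

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse the leading 'Mar 24 15:58:01' stamp; return None if absent."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=30)):
    """Yield (previous_time, next_time) pairs farther apart than threshold."""
    previous = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a recognizable timestamp
        if previous is not None and stamp - previous > threshold:
            yield previous, stamp
        previous = stamp
```

For example, list(find_gaps(open("/var/log/messages"), 2011)) returns each
pair of consecutive timestamps more than 30 minutes apart. The pair that
straddles a crash ends at the first message logged after the reboot, so
its start is the last thing the system managed to write.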

Thanks for your help!


Best Regards,

Dave Windsor

Robert Bosch LLC
Team Leader, MES Database Infrastructure Group (AdP/TEF7.1)
4421 Highway 81 North
Anderson, SC 29621 USA
www.bosch.us

Tel: 1 (864) 260-8459
Fax: 1 (864) 260-8422
Dave.Windsor at us.bosch.com