On 3/24/2011 12:37 PM, Alain Péan wrote:
> On 24/03/2011 16:03, Windsor Dave L. (AdP/TEF7.1) wrote:
>> <snipped>
>> Code: 00 00 00 00 00 00 00 00 70 4d 4f 9d 00 81 ff ff 98 e4 4b dc
>> RIP [<ffff8100dc435cf0>]
>> RSP<ffff81001529fd18>
>> CR2: ffff8100dc435cf0
>> <0>Kernel panic - not syncing: Fatal exception
>>
>> <snipped>
>> I am trying to determine if this is pointing to a hardware or software
>> issue. Some of the Google results suggested using a CentOSPlus kernel -
>> is this a good idea?
>>
>> The server is a HP DL380 G7 Server with 4 GB RAM (1 DIMM 1333 MHz), one
>> 4-core CPU (2133 MHz), 4 built-in Broadcom "NetXtreme II BCM5709 II
>> Gigabit Ethernet" NICs, and a P410 Smart Array Controller. The P410 and
>> the system BIOS have both been updated to the latest levels to see if
>> that fixes the crashes, with no change.
>>
>> Any idea where I should look next?
>>
>> Thanks for any help anyone can provide!
>>
>
> The fact that it appears after two weeks or so reminds me of a bug I
> saw on the Linux PowerEdge mailing list, the "blocked for more than 120
> seconds" timeout bug.
> I don't know if your problem is related, but if it is, you
> should see the message in your logs.
>
> Do you have any high I/O load on your server, at least at some moments?
>
> See:
> http://lists.us.dell.com/pipermail/linux-poweredge/2011-March/044515.html
>
> In that case, using a newer kernel does indeed seem like a good idea.
>
> See if it can help...
>
> Alain
>
> --
> ==========================================================
> Alain Péan - LPP/CNRS
> System/Network Administrator
> Laboratoire de Physique des Plasmas - UMR 7648
> Observatoire de Saint-Maur
> 4, av de Neptune, Bat. A
> 94100 Saint-Maur des Fossés
> Tel : 01-45-11-42-39 - Fax : 01-48-89-44-33
> ==========================================================

Alain,

There are no high I/O loads today. This server was intended to replace two older HP-UX servers.
I had just begun to migrate the workload to the new server when the crashes began to occur. There are some minor, sporadic I/O loads but nothing that I would think could trigger the bug discussed in your link. However, I haven't measured the workload closely yet, so there could be spikes.

Best Regards,

Dave Windsor
Robert Bosch LLC
Team Leader, MES Database Infrastructure Group (AdP/TEF7.1)
4421 Highway 81 North
Anderson, SC 29621
USA
www.bosch.us <http://www.bosch.us>
Tel: 1 (864) 260-8459
Fax: 1 (864) 260-8422
Dave.Windsor at us.bosch.com
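
For checking the logs for the "blocked for more than 120 seconds" message mentioned above, here is a minimal Python sketch. It assumes the usual wording of the kernel's hung-task warning ("INFO: task NAME:PID blocked for more than N seconds"). The function names and the /var/log/messages path are illustrative assumptions, not something taken from this thread, so adjust them to wherever syslog writes kernel messages on the server.

```python
import re

# Assumed wording of the kernel hung-task warning, of the form:
#   INFO: task <name>:<pid> blocked for more than <N> seconds.
HUNG_TASK = re.compile(
    r"INFO: task (.+?):(\d+) blocked for more than (\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (task_name, pid, seconds) for every hung-task warning found."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return hits


def scan_file(path="/var/log/messages"):
    """Scan one log file; the default path is an assumption, not a given."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, errors="replace") as handle:
        return find_hung_tasks(handle)
```

Calling scan_file() as root and getting an empty list back would suggest the hung-task warning never reached that log file. It would not rule out a panic that happened before syslog could flush to disk.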