[CentOS] EDAC Kernel Panic 2.6.9-78 and above

Tue Oct 20 06:30:55 UTC 2009
Michael Schumacher <michael.schumacher at pamas.de>

 Chris,

> I've got a production system running CentOS 4 that was rock solid
> until I upgraded from 2.6.9-55 to 2.6.9-78.0.13 (now running
> 2.6.9-89.0.11). The system now crashes intermittently after a few
> weeks. I finally caught the panic message:

> EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
> Kernel panic - not syncing: MC0: Uncorrected Error

> Looking at the kernel changelog, I see that EDAC support was added
> for the Intel 5000 chipset in 2.6.9-68.20.EL which this server runs.

Same issue here with a machine running CentOS 5.3. The problem began
with a kernel update that introduced EDAC support for the 5000 chipset.
See the thread "RAM errors after kernel-update" for more details. I
haven't been able to solve the problem yet, but because the machine
crashes every two days with this kernel, I had to boot an earlier
kernel without EDAC support for this chipset.


> I'm trying to determine if this is a potential memory issue, or is
> this related to some other hardware item. Also considering disabling
> EDAC in the kernel (is "noedac" a valid option?) as a last resort. I
> will run memtest86+ on the server as soon as possible to check the
> memory, just formulating my game plan if it's something else.

Don't use the memtest86+ version that comes with the CentOS ISO. There is
a much newer version available from the author's website. Only the new
version identifies the chipset correctly.
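
While waiting for a memtest86+ window, it may help to watch whether EDAC
is counting memory errors at all before the panic. The following is only
a minimal sketch in Python: it assumes the EDAC driver exposes per-controller
ce_count (corrected) and ue_count (uncorrected) files under
/sys/devices/system/edac/mc, which may not match the sysfs layout of these
particular kernels. The helper name read_edac_counts is hypothetical.

```python
import os


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int, "ue_count": int}}.

    'base' is an assumed sysfs location; adjust it if the running
    kernel lays out the EDAC entries differently.
    """
    counts = {}
    if not os.path.isdir(base):
        # EDAC driver not loaded, or the sysfs layout differs.
        return counts
    for name in sorted(os.listdir(base)):
        mc_dir = os.path.join(base, name)
        # Only look at memory-controller directories such as mc0, mc1.
        if not name.startswith("mc") or not os.path.isdir(mc_dir):
            continue
        entry = {}
        for counter in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(mc_dir, counter)) as fh:
                    entry[counter] = int(fh.read().strip())
            except (IOError, OSError, ValueError):
                # Counter missing or unreadable on this kernel.
                entry[counter] = None
        counts[name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in sorted(read_edac_counts().items()):
        print(mc, entry)
```

A ce_count that keeps rising on one controller would point towards a real
memory problem, whereas counters that stay at zero until the panic would fit
the suspicion that the driver itself is at fault.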
-- 
Kind regards
Michael Schumacher
mailto:michael.schumacher at pamas.de