[CentOS] EDAC Kernel Panic 2.6.9-78 and above

Thu Oct 22 08:20:14 UTC 2009
ken <gebser at mousecar.com>

On 10/21/2009 10:21 PM Philip Gwyn wrote:
> On 20-Oct-2009 Michael Schumacher wrote:
>>> I've got a production system running CentOS 4 that was rock solid
>>> until I upgraded from 2.6.9-55 to 2.6.9-78.0.13 (now running
>>> 2.6.9-89.0.11). The system now crashes intermittently after a few
>>> weeks. I finally caught the panic message :
>>> EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
>>> Kernel panic - not syncing: MC0: Uncorrected Error
> 
> I have also seen this message or something very close.  The server is 200 km
> away and the person who read it to me over the phone wasn't very fluent in
> English.
> 
> That server has an ASUS DSBF-D12 motherboard.  Kernel was
> 2.6.9-89.0.11.EL.  The crash could happen within hours or even minutes.
> 
> I downgraded to 2.6.9-55.0.9.EL, which doesn't have the i5000_edac module.  Now
> that I have a PDU and remote KVM set up, I'm going to try other kernels
> tomorrow.
> 
> -Philip

When I upgrade a kernel on CentOS, the previous kernels are not removed.
They remain in the boot menu, though they are no longer booted by
default.  E.g.,

cat /boot/grub/menu.lst
...
title CentOS (2.6.18-164.2.1.el5.plus)
        root (hd0,2)
        kernel /vmlinuz-2.6.18-164.2.1.el5.plus ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
        initrd /initrd-2.6.18-164.2.1.el5.plus.img
title CentOS (2.6.18-164.el5)
        root (hd0,2)
        kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
        initrd /initrd-2.6.18-164.el5.img
title CentOS (2.6.18-128.7.1.el5)
        root (hd0,2)
        kernel /vmlinuz-2.6.18-128.7.1.el5 ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
        initrd /initrd-2.6.18-128.7.1.el5.img
...

If your /boot/grub/menu.lst is similar, then you need only select a
previously installed kernel at the boot menu.  You can access this via
your remote KVM setup, yes?

In the past I've edited menu.lst to change what's booted: I rearranged
the stanzas so that the working/desired kernel came first.  With the
usual default=0 setting, the first stanza is the one booted if no action
is taken at the boot menu.
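
An alternative to reordering the stanzas is to change the default=N
line, where N is the zero-based position of the stanza to boot.  Below
is a rough sketch of two helper functions for that.  The names
list_titles and set_default are made up for illustration, and the sketch
assumes menu.lst already contains a line of the form "default=N" (the
form a stock CentOS install normally writes); if it does not, the file
is left unchanged.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of GRUB or CentOS.

# Print every "title" line with the zero-based index GRUB legacy
# uses when it interprets default=N.
list_titles() {
    awk '/^[[:space:]]*title[[:space:]]/ { printf "%d: %s\n", n++, $0 }' "$1"
}

# Set default=N in the given file. The previous contents are kept
# in FILE.bak so the change can be undone by copying it back.
# Assumes an existing "default=..." line at the start of a line.
set_default() {
    file=$1
    index=$2
    cp -p "$file" "$file.bak" || return 1
    sed "s/^default=.*/default=$index/" "$file.bak" > "$file"
}
```

As root, one would run "list_titles /boot/grub/menu.lst" to find the
index of the known-good kernel, then "set_default /boot/grub/menu.lst 1"
(with whatever index applies) and check the file before rebooting.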

hth,
ken