On 10/21/2009 10:21 PM Philip Gwyn wrote:
On 20-Oct-2009 Michael Schumacher wrote:
I've got a production system running CentOS 4 that was rock solid until I upgraded from 2.6.9-55 to 2.6.9-78.0.13 (now running 2.6.9-89.0.11). The system now crashes intermittently after a few weeks. I finally caught the panic message:

    EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
    Kernel panic - not syncing: MC0: Uncorrected Error
I have also seen this message or something very close. The server is 200 km away and the person who read it to me over the phone wasn't very fluent in English.
That server has an ASUS DSBF-D12 motherboard. The kernel was 2.6.9-89.0.11.EL. The crash could happen within hours or even minutes.
I downgraded to 2.6.9-55.0.9.EL, which doesn't have the i5000_edac module. Now that I have a PDU and remote KVM set up, I'm going to try other kernels tomorrow.
-Philip
When I've upgraded a kernel on CentOS, the previous kernels were not removed; they remain in the boot menu, though they are no longer the ones booted by default. E.g.,
    cat /boot/grub/menu.lst
    ...
    title CentOS (2.6.18-164.2.1.el5.plus)
            root (hd0,2)
            kernel /vmlinuz-2.6.18-164.2.1.el5.plus ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
            initrd /initrd-2.6.18-164.2.1.el5.plus.img
    title CentOS (2.6.18-164.el5)
            root (hd0,2)
            kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
            initrd /initrd-2.6.18-164.el5.img
    title CentOS (2.6.18-128.7.1.el5)
            root (hd0,2)
            kernel /vmlinuz-2.6.18-128.7.1.el5 ro root=/dev/mapper/luks-3d723b4f-0184-438d-9cb9-9ebff16e683a rhgb quiet
            initrd /initrd-2.6.18-128.7.1.el5.img
    ...
If your /boot/grub/menu.lst is similar, then you need only select a previously installed kernel at the boot menu. You can access this via your remote KVM setup, yes?
In the past I've edited menu.lst to change what's booted, i.e., I rearranged the order of the stanzas so that the working/desired kernel comes first. With the usual default=0 setting, the first stanza is the default (the one booted if no action is taken at the boot menu).
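An alternative to reordering the stanzas is to change the default= line so that it points at the stanza you want; GRUB numbers the title entries from 0 in file order. Below is a rough sketch of that idea in Python, assuming a GRUB legacy menu.lst like the one above. The function names (list_stanzas, set_default) and the SAMPLE text are made up for illustration, and the code only transforms text in memory; it does not read or write /boot/grub/menu.lst, so back up the real file before applying anything like this to it.

```python
import re

# Hypothetical sample, shortened from the menu.lst shown above.
SAMPLE = """default=0
timeout=5
title CentOS (2.6.18-164.2.1.el5.plus)
        root (hd0,2)
title CentOS (2.6.18-164.el5)
        root (hd0,2)
title CentOS (2.6.18-128.7.1.el5)
        root (hd0,2)
"""

def list_stanzas(menu_text):
    """Return the title of each boot stanza in file order (index 0 first)."""
    titles = []
    for line in menu_text.splitlines():
        parts = line.strip().split(None, 1)
        if parts and parts[0] == "title":
            titles.append(parts[1] if len(parts) > 1 else "")
    return titles

def set_default(menu_text, index):
    """Return menu_text with its default line pointing at stanza `index`."""
    if not 0 <= index < len(list_stanzas(menu_text)):
        raise ValueError("no stanza with index %d" % index)
    # GRUB legacy accepts both "default=0" and "default 0".
    pattern = r"(?m)^default[= \t]+\S+[ \t]*$"
    new_text, count = re.subn(pattern, "default=%d" % index, menu_text)
    if count == 0:
        # No default line present: add one at the top of the file.
        new_text = "default=%d\n" % index + menu_text
    return new_text

if __name__ == "__main__":
    # Show each stanza with the index that default= would refer to.
    for i, title in enumerate(list_stanzas(SAMPLE)):
        print(i, title)
    # Make the second stanza (index 1) the default and print the result.
    print(set_default(SAMPLE, 1))
```

Either approach ends with the same kernel being booted when nobody touches the boot menu; changing default= just leaves the stanza order as the kernel packages wrote it.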
hth, ken