[CentOS] Uniprocessor kernel booted after YUM update

Tue Sep 20 15:29:02 UTC 2005
Dr R L Oswald <L.Oswald at cranfield.ac.uk>

We have Xeon IA32 dual-processor servers running CentOS 3.5 in an HPC
batch-only compute grid configuration. We have yum update running
automatically, with the default updates applied weekly. Because of the
workload pattern of long-running jobs, the servers tend to stay up
without a reboot for very long periods.

Recently, yum installed an updated kernel, 2.4.21-32.0.1.ELsmp. When we
got around to rebooting, we found that some of the machines were running
the uniprocessor kernel 2.4.21-32.0.1.EL, showing only a single CPU.
The grub.conf file had been modified in the usual pushdown manner, but
the default entry had been set to 2 instead of 0.

Bizarrely, some of the systems DID boot the upgraded SMP kernel as expected.
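
Since only some nodes are affected, a quick per-node check helps sort
them. The following is a minimal Python sketch, not something we run in
production: the function names are made up for illustration, and it
assumes that SMP kernel releases end in "smp" (as ours do) and that
/proc/cpuinfo has one "processor" line per CPU the kernel sees.

```python
import platform


def count_cpus(cpuinfo_text):
    """Count the 'processor' entries in the text of /proc/cpuinfo."""
    return sum(
        1 for line in cpuinfo_text.splitlines()
        if line.startswith("processor")
    )


def is_smp_release(release):
    """True if a kernel release string such as '2.4.21-32.0.1.ELsmp'
    carries the 'smp' suffix (an assumption about the naming scheme)."""
    return release.endswith("smp")


def check_local_node():
    """Return (release, cpu_count, looks_ok) for the machine this runs on."""
    release = platform.release()  # same string as `uname -r`
    with open("/proc/cpuinfo") as handle:
        cpus = count_cpus(handle.read())
    # A dual-processor node should be on an SMP kernel and see >= 2 CPUs.
    return release, cpus, is_smp_release(release) and cpus >= 2
```

Calling check_local_node() on each node and collecting the results
would list which machines came up on the UP kernel.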

Here is the grub.conf from an affected server:

default=2
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title CentOS (2.4.21-32.0.1.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.0.1.ELsmp ro root=LABEL=/
        initrd /initrd-2.4.21-32.0.1.ELsmp.img
title CentOS (2.4.21-32.0.1.EL)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.0.1.EL ro root=LABEL=/
        initrd /initrd-2.4.21-32.0.1.EL.img
title CentOS-3 (2.4.21-32.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.ELsmp ro root=LABEL=/
        initrd /initrd-2.4.21-32.ELsmp.img
title CentOS-3-up (2.4.21-32.EL)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.EL ro root=LABEL=/
        initrd /initrd-2.4.21-32.EL.img
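
GRUB legacy numbers the title entries from zero in file order, so the
default= line above can be resolved to a title mechanically. Below is a
minimal Python sketch of that lookup, handy for checking many nodes. The
function names and the shortened SAMPLE text are illustrative only, and
the sketch assumes a numeric default= value (it does not handle
"default=saved").

```python
def grub_titles(conf_text):
    """Return the title strings of a GRUB legacy config, in file order."""
    titles = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("title "):
            titles.append(stripped.split(None, 1)[1])
    return titles


def default_title(conf_text):
    """Return (index, title) for the entry selected by the default= line."""
    index = 0  # GRUB legacy uses entry 0 when no default= line is present
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("default="):
            index = int(stripped.split("=", 1)[1])  # numeric values only
    return index, grub_titles(conf_text)[index]


# Titles copied from the grub.conf above; kernel/initrd lines omitted.
SAMPLE = """\
default=2
timeout=10
title CentOS (2.4.21-32.0.1.ELsmp)
title CentOS (2.4.21-32.0.1.EL)
title CentOS-3 (2.4.21-32.ELsmp)
title CentOS-3-up (2.4.21-32.EL)
"""

print(default_title(SAMPLE))  # (2, 'CentOS-3 (2.4.21-32.ELsmp)')
```

Read this way, default=2 in the file points at the older SMP entry,
which is not the kernel the affected machines actually boot.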

Changing the default back to 0 has no effect: it still boots the
2.4.21-32.0.1.EL kernel and not the required SMP one. However, if we
select the correct kernel by hand from the interactive GRUB boot menu,
it boots SMP OK with both processors and all memory available.

I tried the obvious ploy of removing the last three kernel entries in
grub.conf and setting default=0, but it still manages to boot the
2.4.21-32.0.1.EL UP kernel even though that kernel is no longer in the
menu list.

We think we will disable automatic yum kernel updates in future, but
meanwhile, does anyone have any suggestions or experiences to share on
this, apart from a complete re-install of each affected node?

Les Oswald