Uniprocessor kernel booted after YUM update - Discuss

20 Sep 2005


We have Xeon IA32 dual-processor servers running CentOS 3.5 in an HPC 
batch-only compute grid configuration. We have yum update running 
automatically, with default updates applied weekly. Because of the 
workload pattern of long-running jobs, the servers tend to stay up 
without a reboot for very long periods.
Recently, yum installed an updated kernel, 2.4.21-32.0.1.ELsmp. When we 
got around to rebooting, we found that some of the machines were running 
the uniprocessor kernel 2.4.21-32.0.1.EL, showing only a single CPU. 
The grub.conf file had been modified in the usual pushdown manner, but 
the default entry had been set to 2 instead of 0.
Bizarrely, some of the systems DID boot the upgraded SMP kernel as expected.
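For anyone wanting to find the affected nodes without logging in to each one by hand, here is a rough sketch that classifies a kernel release string by its suffix. It assumes a Python 3 interpreter is available on the node and that SMP builds always end in "smp", as the kernel names in this post do; the function name is my own.

```python
import platform


def is_smp_release(release):
    """Return True if the kernel release string ends in 'smp'.

    Assumption: SMP builds carry an 'smp' suffix (e.g. 2.4.21-32.0.1.ELsmp)
    and uniprocessor builds do not (e.g. 2.4.21-32.0.1.EL).
    """
    return release.strip().lower().endswith("smp")


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    release = platform.release()
    kind = "SMP" if is_smp_release(release) else "not SMP"
    print(release + ": " + kind)
```

Run from the batch system or over ssh, this prints one line per node that can be grepped for "not SMP".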
Here is the grub.conf from an affected server:
default=2
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title CentOS (2.4.21-32.0.1.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.0.1.ELsmp ro root=LABEL=/
        initrd /initrd-2.4.21-32.0.1.ELsmp.img
title CentOS (2.4.21-32.0.1.EL)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.0.1.EL ro root=LABEL=/
        initrd /initrd-2.4.21-32.0.1.EL.img
title CentOS-3 (2.4.21-32.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.ELsmp ro root=LABEL=/
        initrd /initrd-2.4.21-32.ELsmp.img
title CentOS-3-up (2.4.21-32.EL)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-32.EL ro root=LABEL=/
        initrd /initrd-2.4.21-32.EL.img
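To make it explicit which entry the default= line in the listing above points at, here is a minimal sketch that resolves the index to a title. It assumes the default=N form with a numeric value and entries counted from 0 in file order; default_entry is a hypothetical helper name, not part of GRUB or yum.

```python
def default_entry(grub_conf_text):
    """Return (index, title) of the entry that 'default=' selects.

    Assumptions: 'default=' holds a plain integer (not 'saved'), and
    entries are numbered from 0 in the order their 'title' lines appear.
    """
    default = 0  # entry 0 is used when no default line is present
    titles = []
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.startswith("default="):
            default = int(stripped.split("=", 1)[1])
        elif stripped.startswith("title "):
            titles.append(stripped[len("title "):].strip())
    if default >= len(titles):
        raise ValueError("default=%d but only %d entries" % (default, len(titles)))
    return default, titles[default]


if __name__ == "__main__":
    # Path is an assumption; pass whichever file you actually edited.
    with open("/boot/grub/grub.conf") as handle:
        print(default_entry(handle.read()))
```

Fed the four title lines above with default=2, it returns the third entry, "CentOS-3 (2.4.21-32.ELsmp)".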
Changing the default back to 0 has no effect: it still boots the 
2.4.21-32.0.1.EL kernel and not the required SMP one. However, if we use 
the interactive GRUB boot menu and select the correct kernel by hand, 
it boots SMP OK with both processors and all memory available.
I tried the obvious ploy of removing the last three kernel entries in 
grub.conf and setting default=0, but it still manages to boot the 
2.4.21-32.0.1.EL UP kernel even though that kernel is no longer in the 
menu list.
We think we will disable automatic yum kernel updates in future, but 
meanwhile, does anyone have any suggestions or experiences to share on 
this, apart from a complete re-install of each affected node?
Les Oswald