[CentOS] Kernel updates do not boot - always boots oldest kernel

Wed Mar 15 06:46:01 UTC 2023
Simon Matter <simon.matter at invoca.ch>

> Here is the complete output of
>
> cat /etc/default/grub
>
> GRUB_TIMEOUT=5
> GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
> GRUB_DEFAULT=0
> GRUB_DISABLE_SUBMENU=true
> GRUB_TERMINAL_OUTPUT="console"
> GRUB_CMDLINE_LINUX="crashkernel=auto
> rd.md.uuid=066ffecb:69137a0b:4e579b4f:dfbf1696
> rd.md.uuid=bd87f682:e6df10e2:d2a6e247:834133f7 rhgb quiet"
> GRUB_DISABLE_RECOVERY="true"
>
> I have only changed GRUB_DEFAULT from "saved" to "0"
>
> I have also run
>
> /usr/sbin/grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg

I may be wrong here, but IIRC using grub2-mkconfig as described in the
GRUB docs didn't work for me when I tried it years ago.

I think you have to find out what happens when a kernel is installed, and
where it goes wrong in your case. If you look at 'rpm -q --scripts kernel',
you can see that new kernels are registered by the script
'/usr/sbin/new-kernel-pkg'. I suggest analyzing exactly what it does.
I think it calls 'grubby' to do the further work...
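
A rough sketch of where one might start looking, assuming a CentOS 7 host
with rpm and grubby installed and a root shell. The function name
inspect_kernel_registration is made up for illustration; it only reads
state and changes nothing:

```shell
# Hypothetical helper: show how kernel packages register with the boot loader.
inspect_kernel_registration() {
    # Bail out quietly on hosts that lack the tools (assumption: CentOS 7).
    for tool in rpm grubby; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "skip: $tool not found"
            return 0
        fi
    done
    # Scriptlets run on kernel install/removal; look for new-kernel-pkg calls.
    rpm -q --scripts kernel | grep -n 'new-kernel-pkg'
    # The kernel grubby currently considers the default.
    grubby --default-kernel
    # Every boot entry grubby knows about, with kernel path and arguments.
    grubby --info=ALL
}
```

Running inspect_kernel_registration after a kernel install and comparing
the entries with what is actually in /boot should show whether the
registration step is the part that fails.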

Regards,
Simon

>
> and seen the grub.cfg and grubenv updated in /boot/efi/EFI/centos
>
> At this point I think I have grub doing its stuff in the correct folder
> / destination used by UEFI for booting.
>
> When I look at grub.cfg there is some stuff I cannot understand
>
> there are five menuentry blocks in this file, like:
>
> menuentry 'CentOS Linux (3.10.0-1160.88.1.el7.x86_64) 7 (Core)' --class
> centos --class gnu-linux --class gnu --class os --unrestricted
> $menuentry_id_option
> 'gnulinux-3.10.0-1160.81.1.el7.x86_64-advanced-7276336b-d2f2-4b94-b491-ad8c5662acb3'
> {
>      load_video
>      set gfxpayload=keep
>      insmod gzio
>      insmod part_gpt
>      insmod part_gpt
>      insmod diskfilter
>      insmod mdraid1x
>      insmod xfs
>      set root='mduuid/bd87f682e6df10e2d2a6e247834133f7'
>      if [ x$feature_platform_search_hint = xy ]; then
>        search --no-floppy --fs-uuid --set=root
> --hint='mduuid/bd87f682e6df10e2d2a6e247834133f7'
> f12be7f3-a6c6-4b90-8c51-286c32d11d12
>      else
>        search --no-floppy --fs-uuid --set=root
> f12be7f3-a6c6-4b90-8c51-286c32d11d12
>      fi
>      linuxefi /vmlinuz-3.10.0-1160.88.1.el7.x86_64
> root=UUID=7276336b-d2f2-4b94-b491-ad8c5662acb3 ro crashkernel=auto
> rd.md.uuid=066ffecb:69137a0b:4e579b4f:dfbf1696
> rd.md.uuid=bd87f682:e6df10e2:d2a6e247:834133f7 rhgb quiet LANG=en_US.UTF-8
>      initrdefi /initramfs-3.10.0-1160.88.1.el7.x86_64.img
> }
>
> The above is the latest kernel. It doesn't boot: the console tells me it
> cannot load the vmlinuz file
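
One way to narrow down the "cannot load vmlinuz" error is to check that
every file named on a linuxefi or initrdefi line really exists on the boot
filesystem. The sketch below assumes the real grub.cfg has each linuxefi
line unwrapped (the mail client wrapped them above) and that the paths are
relative to a separate /boot filesystem, as the entries suggest. The
function name check_boot_files is hypothetical:

```shell
# Hypothetical helper: report kernel/initramfs files that grub.cfg refers to
# but that are missing from the boot filesystem.
check_boot_files() {
    grub_cfg=$1            # e.g. /boot/efi/EFI/centos/grub.cfg
    boot_dir=${2:-/boot}   # where the boot filesystem is mounted (assumption)
    # The second field of a linuxefi/initrdefi line is the file path.
    awk '$1 == "linuxefi" || $1 == "initrdefi" { print $2 }' "$grub_cfg" |
    sort -u |
    while read -r path; do
        if [ -e "$boot_dir$path" ]; then
            echo "found   $boot_dir$path"
        else
            echo "MISSING $boot_dir$path"
        fi
    done
}
```

For example, check_boot_files /boot/efi/EFI/centos/grub.cfg /boot lists one
line per referenced file. If the files are all present but GRUB still
cannot load them, the filesystem GRUB finds at boot time may not be the one
mounted on /boot.
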
>
> the kernel that boots looks like:
>
> menuentry 'CentOS Linux (3.10.0-1160.36.2.el7.x86_64) 7 (Core)' --class
> centos --class gnu-linux --class gnu --class os --unrestricted
> $menuentry_id_option
> 'gnulinux-3.10.0-1160.36.2.el7.x86_64-advanced-7276336b-d2f2-4b94-b491-ad8c5662acb3'
> {
>      load_video
>      set gfxpayload=keep
>      insmod gzio
>      insmod part_gpt
>      insmod part_gpt
>      insmod diskfilter
>      insmod mdraid1x
>      insmod xfs
>      set root='mduuid/bd87f682e6df10e2d2a6e247834133f7'
>      if [ x$feature_platform_search_hint = xy ]; then
>        search --no-floppy --fs-uuid --set=root
> --hint='mduuid/bd87f682e6df10e2d2a6e247834133f7'
> f12be7f3-a6c6-4b90-8c51-286c32d11d12
>      else
>        search --no-floppy --fs-uuid --set=root
> f12be7f3-a6c6-4b90-8c51-286c32d11d12
>      fi
>      linuxefi /vmlinuz-3.10.0-1160.36.2.el7.x86_64
> root=UUID=7276336b-d2f2-4b94-b491-ad8c5662acb3 ro crashkernel=auto
> rd.md.uuid=066ffecb:69137a0b:4e579b4f:dfbf1696
> rd.md.uuid=bd87f682:e6df10e2:d2a6e247:834133f7 rhgb quiet
>      initrdefi /initramfs-3.10.0-1160.36.2.el7.x86_64.img
> }
>
> I see that the first line names the kernel in brackets (correctly) but
> the $menuentry_id_option '.....' doesn't make sense to me.
>
> For the kernel that boots (3.10.0-1160.36.2) the entry is
> 'gnulinux-3.10.0-1160.36.2.el7.x86_64-advanced-7276336b-d2f2-4b94-b491-ad8c5662acb3'
>
> For kernels that don't boot, e.g. (3.10.0-1160.88.1), we see
>
> 'gnulinux-3.10.0-1160.81.1.el7.x86_64-advanced-7276336b-d2f2-4b94-b491-ad8c5662acb3'
>
> and this entry just seems wrong
>
> firstly the kernel version doesn't match - it has been set to ... 81.1
> ... rather than 88.1
>
> secondly the last part of the line is the same for every menuentry, namely
>
> -advanced-7276336b-d2f2-4b94-b491-ad8c5662acb3
>
> Where does this come from, and what is this part for?
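
As far as I know, the id is generated by /etc/grub.d/10_linux in the form
gnulinux-<version>-<type>-<device id>, where the type is "advanced" or
"recovery" and the device id is the UUID of the root filesystem. That
would explain why the suffix is identical for every entry: it matches the
root=UUID= value on the linuxefi lines. To spot entries where the version
in the title and the version in the id disagree, something like the
following sketch may help. It assumes each menuentry line is on a single
line in the real grub.cfg; the function name check_menuentry_ids is
hypothetical:

```shell
# Hypothetical helper: compare the kernel version in each menuentry title
# with the version embedded in its menuentry id.
check_menuentry_ids() {
    awk '
    /^[ \t]*menuentry / {
        title = ""; id = ""
        # Version between the parentheses in the title.
        if (match($0, /\([^)]*\)/))
            title = substr($0, RSTART + 1, RLENGTH - 2)
        # Version between "gnulinux-" and the "-advanced-"/"-recovery-" tag.
        if (match($0, /gnulinux-[^ ]*-(advanced|recovery)-/)) {
            id = substr($0, RSTART + 9, RLENGTH - 9)
            sub(/-(advanced|recovery)-$/, "", id)
        }
        status = (title == id) ? "ok" : "MISMATCH"
        printf "%s title=%s id=%s\n", status, title, id
    }' "$1"
}
```

Calling check_menuentry_ids /boot/efi/EFI/centos/grub.cfg prints one line
per menu entry. A MISMATCH line only says the two versions differ; it does
not by itself say which part of the kernel install wrote the wrong value.
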
>
> Thanks
> Rob
>
>
> On 15/03/23 05:05, Leon Fauster via CentOS wrote:
>> On 14.03.23 at 12:30, Rob Kampen wrote:
>>> OK,
>>>
>>> found out the problem as to why it doesn't boot any kernel except 36.2
>>>
>>> the system reports that it cannot find
>>>
>>> vmlinuz-3.10.0-1160.88.1.el7.x86_64
>>>
>>> or any one of the others, except for
>>> vmlinuz-3.10.0-1160.36.2.el7.x86_64
>>>
>>> hence a manual selection from the grub menu when in front of the
>>> machine will only load the 36.2 kernel
>>>
>>> I found that under /boot/grub2 there were two .rpmnew files that
>>> mucked up the symbolic link to the grubenv file - so fixed that and
>>> did a reinstall of the latest kernel.
>>>
>>> Now all the grub and efi files appear to update correctly - progress.
>>>
>>> Now just need to work out why the efi boot process can see the old
>>> (original) kernel (36.2) but none of the later ones.
>>>
>>> Any ideas of where to look for this? seems a much more fundamental
>>> problem related to kernel install and efi booting
>>
>>
>> What's the _complete_ output of cat /etc/default/grub?
>>
>> --
>> Leon
>>
>> _______________________________________________
>> CentOS mailing list
>> CentOS at centos.org
>> https://lists.centos.org/mailman/listinfo/centos