Hi. I have a CentOS 5.3 system that I updated to CentOS 5.5 a few days ago, and I just ran "yum -y update" again to get the latest kernel, but I noticed it still has the old 2.6.18-128 kernel instead of the new 2.6.18-194.17. What gives?
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
Any suggestions?
Thanks, -at
myserver# yum -y update
...
myserver# reboot
.....
myserver# uname -a
Linux hwd-ddc-sonydb-prod 2.6.18-128.4.1.el5 #1 SMP Tue Aug 4 20:19:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
myserver# cat /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/md1
#          initrd /initrd-version.img
#boot=/dev/md0
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-194.17.1.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.17.1.el5 ro root=/dev/md1
        initrd /initrd-2.6.18-194.17.1.el5.img
title CentOS (2.6.18-194.11.4.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.11.4.el5 ro root=/dev/md1
        initrd /initrd-2.6.18-194.11.4.el5.img
title CentOS (2.6.18-128.4.1.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-128.4.1.el5 ro root=/dev/md1
        initrd /initrd-2.6.18-128.4.1.el5.img
myserver# rpm -q kernel
kernel-2.6.18-128.4.1.el5
kernel-2.6.18-194.11.4.el5
kernel-2.6.18-194.17.1.el5
myserver#
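For reference, a quick sanity check at this point might be to confirm which device /boot is actually mounted from and whether the new kernel's files are really there. A minimal sketch (the kernel version in the ls line is just the newest one from the rpm output above):

# df -h /boot                               # which block device /boot is mounted from
# ls -l /boot/vmlinuz-2.6.18-194.17.1.el5   # is the new kernel image actually present on that /boot?
# cat /proc/mdstat                          # state of the md arrays ([UU] = healthy mirror, [_U] = degraded)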
/snip
just noticed it still has the old 2.6.18-128 kernel instead of the new 2.6.18-194.17. What gives?
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
Any suggestions?
/snip
kernel /vmlinuz-2.6.18-194.17.1.el5 ro root=/dev/md1
Yup, looking at your root, I am guessing your system isn't booting off the /boot you think it is.
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
I suspect you are looking at /etc/grub.conf, which is supposed to be a symlink to /boot/grub/grub.conf, and that somehow it is no longer a symlink but rather a file of its own, which is not being read.
This is how it should look (adjusted for screen wrap in a mail program):
# ls -la /etc/grub.conf
lrwxrwxrwx 1 root root 22 Sep 27  2008 /etc/grub.conf -> ../boot/grub/grub.conf
What are the contents of your /boot/grub/grub.conf? That is the actual file. What is the default set to there?
Barry
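If the symlink theory turns out to be the cause, a minimal sketch of how to check and repair it (the .bak name is just an example):

# ls -la /etc/grub.conf                      # a regular file instead of a symlink would confirm the theory
# diff /etc/grub.conf /boot/grub/grub.conf   # see how the stale copy differs from the real config
# mv /etc/grub.conf /etc/grub.conf.bak       # keep the stray copy around, just in case
# ln -s ../boot/grub/grub.conf /etc/grub.conf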
On 10/03/10 9:16 AM, Barry Brimer wrote:
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
I suspect you are looking at /etc/grub.conf, which is supposed to be a symlink to /boot/grub/grub.conf, and that somehow it is no longer a symlink but rather a file of its own, which is not being read.
This is how it should look (adjusted for screen wrap in a mail program):
# ls -la /etc/grub.conf
lrwxrwxrwx 1 root root 22 Sep 27  2008 /etc/grub.conf -> ../boot/grub/grub.conf
What are the contents of your /boot/grub/grub.conf? That is the actual file. What is the default set to there?
Also check what /boot is mounted from. The /etc/grub.conf file you showed says that root is /dev/md1; is that actually the case?
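A sketch of the checks being suggested here, using the stock CentOS 5 tools:

# mount | grep boot        # which device /boot is mounted from
# df -h / /boot            # what the root filesystem and /boot really are
# cat /proc/mdstat         # whether the md devices behind them are healthy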
Thank you very much for your replies and suggestions!
Turns out I have a broken RAID. I checked the failed-out drive by mounting its /boot partition read-only, and it is configured to boot the older kernel version (the one the system actually boots).
Like Phil said, the OS is seeing one thing, and GRUB another.
Questions:
1. How do I fix the array? How do I put the failed-out drive back in? (I hope it is a small failure that the software RAID can recover from, like a few bad blocks or something. Otherwise I am willing to replace the drive.)
# mdadm /dev/md0 --add /dev/sda1
mdadm: Cannot open /dev/sda1: Device or resource busy
#
Maybe it's busy because the system really booted off it? Maybe I can edit grub.conf to change (hd0,0) to (hd1,0) and reboot. Where do I do that, in /dev/sda1 or /dev/sdb1? I guess I could do it in both places. What do you think?
Note: I was able to add /dev/sda3 to /dev/md1, and it is resync'ing the array now.
# mdadm /dev/md1 --add /dev/sda3
mdadm: re-added /dev/sda3
#
2. Is there a different configuration I should adopt, so that OS and GRUB agree on what device to boot from? Or is this the price I pay for using software RAID rather than HW RAID?
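For what it's worth, the usual answer to question 2 with md RAID 1 and GRUB legacy is to install the boot loader into the MBR of both mirror members, so either disk can bring up the same /boot. A sketch, assuming /dev/sda and /dev/sdb are the two members as shown in the data below:

# grub
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

Mapping each disk to (hd0) in turn writes each MBR as if that disk were the first BIOS drive, which is what you want if the other disk dies or drops out of the array.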
Data:
The /etc/grub.conf symlink is set up correctly:
lrwxrwxrwx 1 root root 22 Mar 17 2009 /etc/grub.conf -> ../boot/grub/grub.conf
My /boot filesystem lives on a RAID 1 array:
/dev/md0 on /boot type ext3 (rw)
/proc/mdstat shows only /dev/sdb is still in the RAID 1 mirror:
Personalities : [raid1]
md0 : active raid1 sdb1[1]
      104320 blocks [2/1] [_U]

md1 : active raid1 sdb3[1]
      275964480 blocks [2/1] [_U]

unused devices: <none>
For some reason, it does not show "F" for disk failure. I did reboot the system a couple of times; maybe it forgot about the failure. Older logwatch reports do show the "F" on both arrays. (A sketch for checking the member state and the resync follows after the data below.)
lshw and "fdisk -l" shows both /dev/sda and /dev/sdb.
so does lsscsi:
[1:0:0:0]  disk     IBM-ESXS  ST3300007LC FN  B26B  /dev/sda
[1:0:1:0]  disk     IBM-ESXS  MAT3300NC FN    B414  /dev/sdb
[1:0:8:0]  process  IBM       39M6750a S320 0 1     -
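As mentioned above, a sketch for keeping an eye on the resync and on the state of the individual members:

# watch -n 5 cat /proc/mdstat     # resync progress and the [_U]/[UU] status
# mdadm --detail /dev/md1         # array-level view: which members are active, rebuilding, or faulty
# mdadm --examine /dev/sda1       # member-level superblock of the kicked-out partition: state and last update time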
Thanks very much for the help!
Best, Aleksey
On 10/6/2010 2:36 PM, Aleksey Tsalolikhin wrote:
Thank you very much for your replies and suggestions!
Turns out I have a broken RAID. I checked the failed-out drive by mounting its /boot partition read-only, and it is configured to boot the older kernel version (the one the system actually boots).
Like Phil said, the OS is seeing one thing, and GRUB another.
Questions:
- How do I fix the array? How do I put the failed-out drive back in? (I hope it is a small failure that the software RAID can recover from, like a few bad blocks or something. Otherwise I am willing to replace the drive.)
# mdadm /dev/md0 --add /dev/sda1
mdadm: Cannot open /dev/sda1: Device or resource busy
#
Maybe it's busy because the system really booted off it?
Do you still have it mounted as you mentioned above? If so, unmount it. If it shows as 'failed' in /proc/mdstat, you would have to use mdadm to remove it before adding it back.
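A sketch of that sequence; /mnt/oldboot is only a placeholder for wherever the read-only mount was made:

# umount /mnt/oldboot                  # release the partition first
# cat /proc/mdstat                     # if sda1 is listed with (F), it must be removed before re-adding
# mdadm /dev/md0 --remove /dev/sda1    # only needed if it still shows as a failed member
# mdadm /dev/md0 --add /dev/sda1       # then add it back and let the mirror rebuild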
On Wed, Oct 6, 2010 at 12:48 PM, Les Mikesell lesmikesell@gmail.com wrote:
On 10/6/2010 2:36 PM, Aleksey Tsalolikhin wrote:
# mdadm /dev/md0 --add /dev/sda1 mdadm: Cannot open /dev/sda1: Device or resource busy #
Do you still have it mounted as you mentioned above? If so, unmount it.
*Facepalm*
That was it. Thank you. /dev/sda1 is back in /dev/md0 and reconstruction is in process.
If it shows as 'failed' in /proc/mdstat you would have to use mdadm to remove it before adding it back.
Ah! Got it, thanks, Les!
Aleksey
Aleksey Tsalolikhin wrote on 10/03/2010 01:16 AM:
Hi. I have a CentOS 5.3 system that I updated to CentOS 5.5 a few days ago, and I just ran "yum -y update" again to get the latest kernel, but I noticed it still has the old 2.6.18-128 kernel instead of the new 2.6.18-194.17. What gives?
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
Any suggestions?
See what "cat /proc/mdstat" shows. I suspect your RAID is degraded. The OS is seeing one member, and GRUB sees the other disk that is not currently being mirrored.
Phil
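For reference, a sketch of what to look for in that output: the string in brackets shows the member slots, so [UU] means both halves of the mirror are up, while [_U] or [U_] means the array is running degraded on a single disk.

# cat /proc/mdstat           # per-array status lines
# mdadm --detail /dev/md0    # fuller view of the /boot mirror
# mdadm --detail /dev/md1    # and of the root mirror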
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Aleksey Tsalolikhin
Sent: Sunday, October 03, 2010 1:17 AM
To: CentOS mailing list
Subject: [CentOS] system "stuck" with 2.6.18-128 kernel. how to move to 2.6.18-194.17?
Hi. I have a CentOS 5.3 system that I updated to CentOS 5.5 a few days ago, and I just ran "yum -y update" again to get the latest kernel, but I noticed it still has the old 2.6.18-128 kernel instead of the new 2.6.18-194.17. What gives?
/etc/grub.conf points at 2.6.18-194.17, but when I reboot, 2.6.18-128 comes up.
Any suggestions?
Thanks, -at
myserver# yum -y update
...
myserver# reboot
.....
myserver# uname -a Linux hwd-ddc-sonydb-prod 2.6.18-128.4.1.el5 #1 SMP Tue Aug 4 20:19:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux myserver# cat /etc/grub.conf # grub.conf generated by anaconda # # Note that you do not have to rerun grub after making changes to this file # NOTICE: You have a /boot partition. This means that # all kernel and initrd paths are relative to /boot/, eg. # root (hd0,0) # kernel /vmlinuz-version ro root=/dev/md1 # initrd /initrd-version.img #boot=/dev/md0 default=0 timeout=5 splashimage=(hd0,0)/grub/splash.xpm.gz hiddenmenu title CentOS (2.6.18-194.17.1.el5)
Caveat: I am not a grub guru, and don't play one on TV. This is the only "title" line in your grub.conf, so it's the only entry in your grub.conf as far as grub cares.
root (hd0,0) kernel /vmlinuz-2.6.18-194.17.1.el5 ro root=/dev/md1 initrd /initrd-2.6.18-194.17.1.el5.img title CentOS
(2.6.18-194.11.4.el5) root (hd0,0) kernel /vmlinuz-2.6.18-194.11.4.el5 ro root=/dev/md1 initrd /initrd-2.6.18-194.11.4.el5.img title CentOS (2.6.18-128.4.1.el5) root (hd0,0) kernel /vmlinuz-2.6.18-128.4.1.el5 ro root=/dev/md1 initrd /initrd-2.6.18-128.4.1.el5.img myserver# rpm -q kernel
This is the last kernel line, so it's the one that grub acts on. Result: You boot 2.6.18-128.4.1.el5
Method to test:
 - check your menu screen at boot time and see if there is only 1 entry;
 - put 'title' in front of '(2.6.18-194.11.4.el5)' and '(2.6.18-128.4.1.el5)' in grub.conf, reboot, and see if you now have 3 entries in the grub boot menu screen;
 - see if putting in the two title lines causes 'default=0' to boot 2.6.18-194.17.1.el5.
/blather
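Assuming grubby is available (it usually is on a stock CentOS 5 install), a quick way to cross-check the test above without rebooting:

# grep -n '^title' /boot/grub/grub.conf   # one line per boot entry, in the order GRUB counts them (starting at 0)
# grubby --default-kernel                 # which kernel the current default entry resolves to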
kernel-2.6.18-128.4.1.el5
kernel-2.6.18-194.11.4.el5
kernel-2.6.18-194.17.1.el5
myserver#