So I have an issue with CentOS 5.3 i386, LVM, and SATA.
Boot device is a 200GB ATA disk on hda2.
I've added a couple of disks using the on-the-mobo SATA controller ports and grown the EXT3 filesystem with system-config-lvm.
Then, as an experiment, I added a PCI SATA controller and an additional disk. Ran system-config-lvm, added the new space to the existing VolGroup00, and all was good.
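(For the record, the command-line equivalent of what system-config-lvm does for that is roughly the following; the device name, size, and LV name are only examples:)

pvcreate /dev/sdb                            # label the new disk as an LVM physical volume
vgextend VolGroup00 /dev/sdb                 # add it to the existing volume group
lvextend -L +200G /dev/VolGroup00/LogVol00   # grow the logical volume into the new space
resize2fs /dev/VolGroup00/LogVol00           # grow the ext3 filesystem to match (online resize)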
Feeling confident, I shut the box down and plugged another disk into the PCI SATA controller. On reboot, I was greeted with a kernel panic because the OS could not find "really-long-label." Removing the new SATA disk did not fix the issue.
Digging around a bit with the box booted from CD in rescue mode, it appears that the LVM filesystem is intact, but the device names (sda, sdb, sdc, etc.) are somehow scrambled in the LVM configuration. After running "lvm pv<something>" I figured out that the device names currently assigned don't match what's in /etc/lvm/backup, /etc/lvm/archive, and so on.
Obviously there's some way to read the volume group info from the disks, since "linux rescue" can mount the root volume in VolGroup00. I've tried vgcfgrestore, but I'm still getting a kernel panic when booting from disk.
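(The sort of poking around I mean, from the rescue shell; the point is to compare the PV UUIDs the kernel sees now with the device names recorded in the metadata backups:)

lvm pvscan                                # list the physical volumes found right now
lvm pvdisplay                             # show each PV's UUID and which VG it belongs to
lvm vgscan                                # rescan for volume groups
grep device /etc/lvm/backup/VolGroup00    # the device names that were recorded in the metadata backup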
What's the best way to get the LVM/BIOS config sorted out so that the system can boot up?
Thanks for any hints.
--Chris
Chris Boyd wrote:
<snip>
My guess would be that the driver for the new PCI card isn't part of the initrd. If you don't have any data on the drive yet, you could always reduce the VG so that it is no longer included. You could also recreate your initrd to include the driver for the PCI card.
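Something along these lines; I'm guessing at the device name and driver module, and using the 2.6.18-128.1.6.el5 kernel as an example, so substitute your own:

# back the new disk out of the VG (only safe once nothing is allocated on it)
pvdisplay /dev/sdc      # check that Allocated PE is 0
pvmove /dev/sdc         # if it isn't, migrate the extents off first (needs free space elsewhere)
vgreduce VolGroup00 /dev/sdc
pvremove /dev/sdc

# or rebuild the initrd so the card's driver gets loaded at boot
mv /boot/initrd-2.6.18-128.1.6.el5.img /boot/initrd-2.6.18-128.1.6.el5.img.bak
mkinitrd --with=sata_promise /boot/initrd-2.6.18-128.1.6.el5.img 2.6.18-128.1.6.el5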
-Shad
On Tue, 2009-05-12 at 09:47 -0600, Shad L. Lords wrote:
<snip>
Alternately, since the BIOS re-assigned the drive letters, the initrd may still contain the ignore-locking-failure activation for only the originally installed disk. I've successfully handled this (in test and live) by extracting the initrd (it's a cpio archive, IIRC), changing the init script to ignore lock failures on the new disk, and making a new initrd. Then it worked. However, that was for a fallback setup: if the first drive got trashed, the backup disk had an LV with a slightly modified name and would be seen as the new sda.
For the OP's situation, might need to search a little further to get to the same results. But I think it's surely something in the initrd, even if the driver is the same.
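From memory, the unpack/edit/repack goes roughly like this (kernel version is just an example; keep a copy of the original image):

cp /boot/initrd-2.6.18-128.1.6.el5.img /boot/initrd-2.6.18-128.1.6.el5.img.bak
mkdir /tmp/initrd-work && cd /tmp/initrd-work
zcat /boot/initrd-2.6.18-128.1.6.el5.img | cpio -idmv   # the image is a gzipped cpio archive
vi init                                                 # adjust the lvm lines in the init script
find . | cpio -o -H newc | gzip -9 > /boot/initrd-2.6.18-128.1.6.el5.img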
<snip sig stuff>
On May 12, 2009, at 11:28 AM, William L. Maltby wrote:
For the OP's situation, might need to search a little further to get to the same results. But I think it's surely something in the initrd, even if the driver is the same.
OK, so two hits on initrd, I'll go and read up on that.
Just to be clear, it could still be an initrd issue even if the card was working with one drive attached?
--Chris
At Tue, 12 May 2009 11:43:54 -0500 CentOS mailing list centos@centos.org wrote:
<snip>
OK, so two hits on initrd, I'll go and read up on that.
Just to be clear, it could still be an initrd issue even if the card was working with one drive attached?
Yes. You 'got away' with it when you set it up because the driver modules, etc. were loaded into the running system during the setup process. When you rebooted, the driver modules, etc. were not loaded, or other aspects of the run-time environment were not set up. Updating the initrd would make sure that the run-time environment is properly initialized early in the boot-up process, whether that means loading driver modules or performing other sorts of setup procedures (such as scanning for LVM volumes).
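A quick way to see the difference (kernel version here is just an example):

lsmod | grep sata                                                # drivers loaded in the running system
zcat /boot/initrd-2.6.18-128.1.6.el5.img | cpio -t | grep sata   # driver files packed into the initrd for boot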
On Tue, 2009-05-12 at 11:43 -0500, Chris Boyd wrote:
<snip>
OK, so two hits on initrd, I'll go and read up on that.
Just to be clear, it could still be an initrd issue even if the card was working with one drive attached?
IMO, yes (maybe). When the initial install is done, I think there is some stuff that is needed in the initrd to find the disk so root can be mounted. I'm not sure which file in the initrd contains it, but I think it's got to be in there somewhere. There are some more considerations I had forgotten.
I'm not that familiar with the (relatively) new device manager stuff, but I would expect that some reference to the equivalent of sda (or somesuch - maybe a device-manager-specific construct?) will be in some file(s).
There is another possibility: grub installs a stage2 (IIRC) file that has a specific device in it. That's probably your next point of failure (I don't recall what your original failure mode was).
A brief reminder: when the BIOS picks another disk to boot from, or finds another HD and reshuffles, it "rotates" the drive assignments. Depending on where it inserted the new HD (let's assume the new one was inserted as the "first" drive), what was 0x80 becomes 0x81, the former 0x81 becomes 0x82, etc. These equate to sda/hda, sdb/hdb, etc.
So, you'll probably have a two-step adjustment: the initrd (there might be nothing in there that needs changing, because it may work off the grub-passed stuff - I don't recall) and a re-install of grub to get a new stage2 (IIRC). Look also at /boot/grub/menu.lst - it might need something in there.
Last, you may need to look at /etc/fstab.
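The lines to eyeball look roughly like this on a stock CentOS 5 install (your kernel version, LV names, and labels may differ):

# /boot/grub/menu.lst - the root (hdX,Y) line and the root= argument
root (hd0,0)
kernel /vmlinuz-2.6.18-128.1.6.el5 ro root=/dev/VolGroup00/LogVol00
initrd /initrd-2.6.18-128.1.6.el5.img

# /etc/fstab - the device (or LABEL=) entries for / and /boot
/dev/VolGroup00/LogVol00   /       ext3    defaults        1 1
LABEL=/boot                /boot   ext3    defaults        1 2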
I'm sorry I can't be more specific - it's been too long since I worked with this stuff.
Best I can do w/o actual reading/experimenting is clues.
<snip sig stuff>
HTH
On Tue, 2009-05-12 at 13:32 -0400, William L. Maltby wrote:
<snip>
P.S. Don't forget to take advantage of the grub edit and search capability. That'll help determine the settings for menu.lst like (hd0,0), (hd1,0), etc.
"info grub" for the details - two brain cells are occupado.
HTH
On May 12, 2009, at 12:32 PM, William L. Maltby wrote:
There is another possibility: grub installs a stage2 (IIRC) file that has a specific device in it. That's probably your next point of failure (I don't recall what your original failure mode was).
The system boots, grub menu starts, counts down and then starts booting the next stage. Then the kernel panics after sitting at "Red Hat Nash" for a bit. Apparently it can't find one of the LVM disks and so can't mount root.
--Chris
On May 12, 2009, at 12:32 PM, William L. Maltby wrote:
IMO, yes (maybe). When the initial install is done, I think there is some stuff that is needed in the initrd to find the disk so root can be mounted. I'm not sure which file in the initrd contains it, but I think it's got to be in there somewhere. There are some more considerations I had forgotten.
Thanks, this gave me the hint I needed.
Looking at the original modprobe.conf file, there was no entry in it for the driver for the second SATA controller.
So to recover, I booted the install CD in "linux rescue" mode.
Ran "chroot /mnt/sysimage"
I added
alias scsi_hostadapter1 sata_promise
to /etc/modprobe.conf
Changed to the /boot directory
mv initrd-2.6.18-128.1.6.el5.img initrd-2.6.18-128.1.6.el5.old
mkinitrd initrd-2.6.18-128.1.6.el5.img 2.6.18-128.1.6.el5
After that completed, I rebooted and the system came up clean--no fsck requested or any other oddities, and system-config-lvm seems to be working fine.
No mucking about in /etc/fstab was needed.
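For the archives, /etc/modprobe.conf ends up looking something like this; the first alias was already there from the install (ata_piix is only an example of an on-board driver, yours may differ):

alias scsi_hostadapter ata_piix        # on-board controller - example name, was already present
alias scsi_hostadapter1 sata_promise   # the PCI SATA card - the line I added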
--Chris
On Tue, 2009-05-12 at 16:19 -0500, Chris Boyd wrote:
<snip>
No mucking about in /etc/fstab was needed.
Glad it all worked out. I now remember why I had to touch fstab. I had some duplicated backups on the hard drives that would be used if the primary drive failed. While testing booting from the second drive, I needed VolGroup00 to be VolGroupAA, plus other similar changes. That is what also required changes to the init file in the initrd: there is an --ignorelockingfailure directive in there that references the real root volume group, and for fallback testing it needed to reference the fallback (VolGroupAA) volume instead.
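From memory, the relevant piece of the init script inside the initrd looks something like this; for the fallback test it was the volume group name on the vgchange line that had to change:

echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00   # VolGroupAA for the fallback disk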
<snip sig stuff>