[CentOS] raid1 custom initrd and yum

Wed Apr 2 02:53:08 UTC 2008

On Tuesday 01 April 2008 15:03, Les Mikesell wrote:
> > maybe that is why the system won't boot anymore after I synced the root
> > partition? ;)
> >
> > I hope...
>
> If you are booting a kernel that can't find your root partition, the
> initrd might be the problem.   There are several other things that also
> have to be right.  You should be able to boot your install cd/dvd with
> "linux rescue" at the boot prompt to fix any of them, so don't panic yet.

OK but can I panic if this system has a max of two IDE devices, no floppy, one 
PCI slot, and I don't have a PCI CD, and it won't boot from USB even?

http://www.tyan.com/archive/products/html/gs10b2094_spec.html

It has a hardware Promise RAID controller but I have it on good authority that 
I don't want to mess with that, software RAID is better, etc.

I installed CentOS onto disk1 using a 10-year-old PC from the basement. 
Then I mirrored in the second disk. All went pretty well until the root 
partition was mirrored, fstab and /proc/mdstat and grub all agreed, and I 
rebooted. 

At this point Grub seems to work OK and the 3 RAID partitions 
(/boot, /home, /) assemble correctly and show UU. It gets all the way into 
INIT - set hostname, check for LVM, and then "Checking filesystems" and then 
tells me that /dev/md1 (should be /home) has a bad superblock:

  The superblock could not be read or does not describe a correct ext2
  filesystem.  If the device is valid and it really contains an ext2
  filesystem (and not swap or ufs or something else), then the superblock
  is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

Then I am dropped to a recovery shell.

But mdadm -E /dev/hd[ad]2 shows nice superblocks on both partitions, and they 
look OK to me.

In dmesg, the last two entries are the raidautorun output and then 
device-mapper starting up.

But, /dev/md1 doesn't exist. /dev/md0 and /dev/md2 are there and seem normal. 

/proc/mdstat contains this:

   md1: active raid1 hdd2[1] hda2[0]
        106012864 blocks [2/2] [UU]

So imagine my surprise when I tried this:

   # mdadm -Q /dev/md1
   mdadm: cannot open /dev/md1: No such file or directory

   # mdadm -Q /dev/hdd2
   /dev/hdd2: is not an md array
   /dev/hdd2: device 1 in 2 device active raid1 /dev/.tmp.md1. Use mdadm --examine for more detail.

But /dev/.tmp.md1 does not exist either, so I cannot stop this mystery array 
or fsck /dev/hd[ad]2, because they are "busy" being part of this non-existent 
device.
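
The mismatch described above (an array listed in /proc/mdstat with no node 
under /dev) can be checked with a small shell sketch like the one below. This 
is only a sketch under assumptions: the function name missing_md_nodes is made 
up for illustration, and the mknod line it prints assumes the usual numbering 
for md block devices (major 9, minor equal to the array number). It only 
prints a candidate command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays that the kernel reports in mdstat
# but that have no matching device node.
# Both paths can be overridden; the defaults are the real locations.
missing_md_nodes() {
    mdstat=${1:-/proc/mdstat}
    devdir=${2:-/dev}
    # Array lines look like "md1 : active raid1 ..." (spacing may vary).
    sed -n 's/^[[:space:]]*\(md[0-9][0-9]*\)[[:space:]]*:.*/\1/p' "$mdstat" |
    while read -r name; do
        if [ ! -e "$devdir/$name" ]; then
            minor=${name#md}
            # Assumption: md block devices use major 9, minor = array number.
            # The command is printed only, never executed here.
            echo "$name missing; candidate fix: mknod $devdir/$name b 9 $minor"
        fi
    done
}
```

Calling missing_md_nodes with no arguments reads /proc/mdstat and checks /dev. 
In the situation above it would be expected to report md1 only, since md0 and 
md2 have their nodes. Whether recreating the node by hand is the right fix 
here is a separate question.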

Arrgh. I be stumped.

Any help would be a big help ;)

Sam