[CentOS] trouble booting the system with I2O hardware RAID

Mon Apr 11 16:23:07 UTC 2005
Aleksandar Milivojevic <amilivojevic at pbl.ca>

I've just made (yet another) CentOS 4 installation.  The install process
seems to go fine, however the machine doesn't want to boot.

The system in question has one of the Adaptec I2O RAID controllers.  I've
configured LVM with one volume group and several logical volumes.  If I
boot into the rescue mode, all looks fine and dandy.  Anaconda finds the
installation, and I can access all volumes.

However, when doing a "real" boot, it gets into trouble.  All required
modules are loaded from the initrd image (as far as I can tell).  The I2O
modules are able to locate the RAID devices; I see all partitions
reported: /dev/i2o/hda1 (unused), /dev/i2o/hdb1 (/boot), and
/dev/i2o/hdb2 (rest of the system under LVM).  The only thing different
from rescue mode is that i2o/hda and i2o/hdb are reversed.  This is
strange, but it shouldn't affect things, since the /boot partition has
the label "/boot" and all the rest is under LVM, so everything should be
device name independent.  I have no idea why the i2o device drivers
behave differently when loaded from the initrd image during boot than
when loaded by Anaconda during installation.

The last couple of messages printed on the screen are:

Creating root device
Mounting root file system
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
mount: error 2 mounting none
Switching to new root
WARNING: can't access (null)
exec of init ((null)) failed!!!: 14
umount /initrd/dev failed: 2
Kernel panic - not syncing: Attempted to kill init!
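
The numbers in the messages above look like standard Linux errno values.
That is an assumption, not something checked against the nash source.
Read that way, 2 is ENOENT and 14 is EFAULT.  A minimal sketch that
decodes just those two codes (errno_name is a hypothetical helper name,
not an existing tool):

```shell
#!/bin/sh
# Map the numeric codes from the boot messages to Linux errno names.
# Only the two codes seen in the log are covered; anything else is
# reported as unknown rather than guessed.
errno_name() {
    case "$1" in
        2)  echo "ENOENT (No such file or directory)" ;;
        14) echo "EFAULT (Bad address)" ;;
        *)  echo "unknown errno $1" ;;
    esac
}

errno_name 2
errno_name 14
```

If that reading is right, the mount and umount failures are both "no
such file or directory", and the failed exec was handed a bad (null)
path rather than a real file name.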

Looking at the "init" script from the initrd image, this corresponds to:

echo Mounting root filesystem
mount -o defaults --ro -t ext3 /dev/root /sysroot
mount -t tmpfs --bind /dev /sysroot/dev
echo Switching to new root
switchroot /sysroot
umount /initrd/dev

This would indicate that the mount of the root file system went OK, but
that the mount of the /dev filesystem then failed (basically, making the
already mounted /dev available at /sysroot/dev).  After the switchroot
/sysroot, the old /dev mount point became inaccessible, the new /dev
mount point was not there, and of course everything broke from that
point on.
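
One way to test this theory from rescue mode is to check whether the
root file system actually contains a dev directory to mount onto and
something that can be run as init.  The following is a minimal sketch
only.  check_root_tree is a hypothetical helper, and the list of init
candidates is an assumption rather than something taken from the nash
source.

```shell
#!/bin/sh
# check_root_tree ROOT
# Report whether ROOT has a dev directory to bind-mount onto and at least
# one executable init candidate.  Returns 0 only if both are present.
check_root_tree() {
    root="$1"
    rc=0
    if [ -d "$root/dev" ]; then
        echo "ok: $root/dev exists"
    else
        echo "missing: $root/dev"
        rc=1
    fi
    found_init=""
    # Candidate paths are an assumption, not read from the nash source.
    for candidate in /sbin/init /etc/init /bin/init /bin/sh; do
        if [ -x "$root$candidate" ]; then
            found_init="$candidate"
            break
        fi
    done
    if [ -n "$found_init" ]; then
        echo "ok: init candidate $found_init"
    else
        echo "missing: no init candidate found"
        rc=1
    fi
    return "$rc"
}

# Usage in rescue mode, where Anaconda mounts the system under /mnt/sysimage:
#   check_root_tree /mnt/sysimage
```

If both checks pass in rescue mode, the installed root itself is
probably fine, and the question becomes which file system the initrd
really mounted on /sysroot during the normal boot.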

I've Googled around a bit, and the only relevant thing Google gave me
was this French page.  There were a couple more pages with a similar but
different problem (modules failing to load and/or detect disk drives,
which is not the case here; all modules were loaded correctly, as
witnessed by the successful LVM initialization and the successful root
file system mount).

http://www.fedora-france.org/modules/newbb/viewtopic.php?topic_id=3838&forum=6&post_id=20970

I do live in Canada, but don't speak a word of French (shame on me, but
in my defense it is on my todo list).  However, I did manage to figure
out that somebody suggested going with Grub instead of LILO.  IMO, Grub
or LILO shouldn't make any difference, since the error is happening way
after the boot loader did its job.  Anyhow, just for fun, I reinstalled
the system from scratch, this time choosing Grub as the boot loader to
be installed into the MBR.  However, for whatever reason, Anaconda did
not install Grub (dd & less showed no signs of Grub in the MBR).  Boot
into the rescue mode, chroot, grub-install, OK, now I have Grub in the
MBR.  But again, no joy.  Grub doesn't even start and the system simply
hangs in mid-air.  No errors printed, no anything.
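
For reference, the dd check mentioned above can be sketched roughly as
follows.  It reads the first 512-byte sector and looks for the literal
string "GRUB", which Grub's first stage carries among its messages.
has_grub_signature is a hypothetical helper name, and the device path in
the usage comment is only an example.  Since the two i2o disks swap
names between rescue mode and normal boot, both may need checking.

```shell
#!/bin/sh
# has_grub_signature DEVICE_OR_FILE
# Succeeds if the first 512 bytes contain the text "GRUB".
# Non-printable bytes are turned into newlines first (a poor man's
# "strings"), so plain grep can search the binary sector.
has_grub_signature() {
    dd if="$1" bs=512 count=1 2>/dev/null \
        | LC_ALL=C tr -c '[:print:]' '[\n*]' \
        | grep -q "GRUB"
}

# Usage (as root, device name is an example only):
#   has_grub_signature /dev/i2o/hda && echo "Grub found in MBR"
```

A match only shows that Grub's first stage is in that sector.  It says
nothing about whether the BIOS boots from that disk or whether the later
stages can be found.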

Currently, I'm kind of stuck and out of ideas.  The system did work
perfectly in the past with Red Hat 7.3 (and LILO as boot loader), and
exactly the same hardware RAID configuration (two volumes, one for
system, one for data).  Any help, hint, etc. would be greatly appreciated.

-- 
Aleksandar Milivojevic <amilivojevic at pbl.ca>    Pollard Banknote Limited
Systems Administrator                           1499 Buffalo Place
Tel: (204) 474-2323 ext 276                     Winnipeg, MB  R3T 1L7