[CentOS] Replacing failed software RAID drive

Sun Oct 7 21:57:49 UTC 2007
Hugh E Cruickshank <hugh at forsoft.com>

CentOS release 4.5

Hi All:

First of all I will admit to being spoiled by my MegaRAID SCSI RAID
controllers. When a drive fails on one of them I just replace the
drive and carry on without having to do anything else.

I now find myself in the situation where I have a failed drive on a
non-MegaRAID controller, specifically an Adaptec 29160 SCSI controller.
The system is an Acer G700 with 8 internal hot-swappable SCSI drives
arranged in two banks of 4 drives. Each bank is connected to a 
separate channel on the 29160 controller. When I installed CentOS 4
I enabled software mirroring between the two banks so that I ended up
with 4 pairs of mirrored drives (sda/sde, sdb/sdf, sdc/sdg, sdd/sdh).

The problem I have now is that it is sda (the boot drive) that has
failed. I have not encountered this problem before and therefore I
need to make sure that I understand what I need to do before I start
mucking around with things and dig myself into a deeper hole.
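
Before changing anything, it may help to confirm from the kernel's own
view which arrays are degraded. The following is a minimal read-only
sketch, assuming the standard /proc/mdstat layout in which a degraded
mirror shows an underscore in its status field (for example [_U]). The
function name degraded_arrays and the MDSTAT variable are illustrative,
not part of any tool.

```shell
#!/bin/sh
# Read-only check of software RAID state; changes nothing on disk.
# MDSTAT is a hypothetical override so the parser can be pointed at a saved copy.
MDSTAT="${MDSTAT:-/proc/mdstat}"

# Print the name of every md array whose status field contains an
# underscore (a missing or failed member), e.g. [_U] or [U_].
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }
        /\[[U_]*_[U_]*\]/  { if (name != "") print name }
    ' "$MDSTAT"
}

if [ -r "$MDSTAT" ]; then
    cat "$MDSTAT"
    for md in $(degraded_arrays); do
        # Follow up as root with: mdadm --detail /dev/$md
        echo "degraded: /dev/$md"
    done
else
    echo "cannot read $MDSTAT" >&2
fi
```

For each array reported, mdadm --detail lists the individual members and
marks the one that is faulty.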

I have spent much time attempting to research the problem but have not
been able to come up with any definite information to help. As far as I
can see I have two options...

Option 1: Leave the system running and replace the drive. Then either
the RAID software will re-sync the drives or I can manually sync them
with mdadm. I have not seen anything that will support this option
but I am hoping that it is a valid option.

Option 2: Create a boot disk (floppy or CD) that I can boot from but
that points to sde (the boot mirror). Shut down the system and replace
the failed sda drive. Boot from the new boot disk. Format, partition
and re-sync the new sda from sde. Shut down, remove the boot disk, and
reboot from the new sda.
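
One detail that matters for Option 2: an md re-sync copies the mirrored
partitions, not the master boot record, so the new sda will not be
bootable until a boot loader is written to it. The sketch below
generates the usual GRUB (legacy) batch commands for a given disk. It
assumes GRUB is the boot loader in use and that /boot is the first
partition on the disk, hence (hd0,0). Both are assumptions, and the
helper name grub_commands is illustrative.

```shell
#!/bin/sh
# Print GRUB (legacy) batch commands that install a boot loader on one disk.
# ASSUMPTION: /boot lives on the first partition, so GRUB's root is (hd0,0).
grub_commands() {
    disk="$1"
    # Mapping the disk to (hd0) makes it bootable when it is the BIOS boot disk.
    printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\nquit\n' "$disk"
}

# Show what would be fed to GRUB for each half of the boot mirror.
# To apply as root: grub_commands /dev/sde | grub --batch
for disk in /dev/sde /dev/sda; do
    echo "# boot loader for $disk"
    grub_commands "$disk"
done
```

If sde already carries its own boot loader, the system may be able to
boot from it directly, provided the BIOS or controller can be told to
start from that drive.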

Can anyone confirm either of these options and point me in the right
direction to any documentation that would assist me?

TIA

Regards, Hugh

-- 
Hugh E Cruickshank, Forward Software, www.forward-software.com