Hugh E Cruickshank wrote:
I now find myself in the situation where I have a failed drive on a non-MegaRAID controller, specifically an Adaptec 29160 SCSI controller. The system is an Acer G700 with 8 internal hot-swappable SCSI drives arranged in two banks of 4 drives. Each bank is connected to a separate channel on the 29160 controller. When I installed CentOS 4 I enabled software mirroring between the two banks so that I ended up with 4 pairs of mirrored drives (sda/sde, sdb/sdf, sdc/sdg, sdd/sdh).
Normally with software mirroring you would mirror partitions, not drives. What does "cat /proc/mdstat" say about them?
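If the output is long, something like the following can pick out the arrays that are missing a member. This is only a rough sketch: `degraded_arrays` is a made-up helper name, and it assumes the usual mdstat layout where each array's status line ends in a field such as `[UU]` (healthy) or `[_U]` (one member missing).

```shell
# Hypothetical helper: read mdstat-format text on stdin and print the name
# of every md array whose status field (e.g. [_U]) shows a missing member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /\[[U_]*_[U_]*\]/ && name != "" {        # status field containing "_"
            print name
            name = ""
        }
    '
}

# Usage on the affected machine:
#   degraded_arrays < /proc/mdstat
```

Any array it prints still needs the member partitions checked by hand in the full /proc/mdstat output, since a failed member is normally also flagged with (F) there.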
The problem I have now is that it is sda (the boot drive) that has failed. I have not encountered this problem before and therefore I need to make sure that I understand what I need to do before I start mucking around with things and dig myself into a deeper hole.
I have spent much time attempting to research the problem but have not been able to come up with any definite information to help. As far as I can see I have two options...
Option 1: Leave the system running and replace the drive. Then either the RAID software will re-sync the drives or I can manually sync them with mdadm. I have not seen anything that will support this option but I am hoping that it is a valid option.
This should work, but you'll probably have to tell the controller that you are removing and adding disks. This used to be done by writing something to /proc/scsi/scsi, but it may have changed and may also be controller-specific, so I'll let someone else point out the documentation for that.
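For what it's worth, here is a rough outline of the sequence as I understand it, written so that it only prints the commands for review rather than running them. Treat all of it as an assumption to verify: `plan_hot_swap` is a made-up helper, the array, disk and partition names are placeholders, and the "host channel id lun" numbers for the /proc/scsi/scsi commands have to be read from /proc/scsi/scsi on your machine. The remove-single-device/add-single-device strings are the old generic interface and may not apply to your kernel or controller.

```shell
# Hypothetical helper: print (do not run) the commands for hot-replacing
# one mirror member. All arguments are placeholders to be filled in from
# /proc/mdstat and /proc/scsi/scsi on the affected machine.
plan_hot_swap() {
    md=$1      # array, e.g. /dev/md0
    bad=$2     # failed disk, e.g. /dev/sda
    good=$3    # surviving mirror disk, e.g. /dev/sde
    part=$4    # partition number that belongs to the array, e.g. 1
    hcil=$5    # "host channel id lun" of the failed disk, e.g. "0 0 0 0"
    cat <<EOF
mdadm $md --fail $bad$part --remove $bad$part
echo "scsi remove-single-device $hcil" > /proc/scsi/scsi
# ...physically swap the drive here...
echo "scsi add-single-device $hcil" > /proc/scsi/scsi
sfdisk -d $good | sfdisk $bad
mdadm $md --add $bad$part
EOF
}

# Example: plan_hot_swap /dev/md0 /dev/sda /dev/sde 1 "0 0 0 0"
```

If the disk carries partitions belonging to several arrays, the fail/remove and add lines would need repeating for each one, and the resync can be watched in /proc/mdstat afterwards.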
Option 2: Create a boot disk (floppy or CD) that I can boot from but that points to sde (the boot mirror). Shut down the system and replace the failed sda drive. Boot from the new boot disk. Format, partition and re-sync the new sda from sde. Shut down, remove the boot disk, and reboot from the new sda.
You have an odd combination of drives... Normally you would want to mirror the partitions on the first 2 disks and install grub on both, in which case the system would still boot. Some of the more sophisticated controllers can boot from more than the first 2, though. Anyway, you should be able to boot from your install CD with 'linux rescue' at the boot prompt and get to a point where you can fix things.
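As a sketch of the "install grub on both" part: with GRUB legacy (what CentOS 4 ships), the usual trick is to map the second disk as (hd0) inside the grub shell so that it can boot on its own if it ends up as the first BIOS disk. The helper below only generates the grub commands; `grub_mirror_script` is a made-up name, and /dev/sde with /boot on its first partition is an assumption about your layout.

```shell
# Hypothetical helper: emit GRUB (legacy) shell commands that install the
# boot loader on a given disk, treating that disk as the first BIOS disk.
grub_mirror_script() {
    disk=$1       # disk to make bootable, e.g. /dev/sde
    bootpart=$2   # GRUB index of the /boot partition (0 = first partition)
    cat <<EOF
device (hd0) $disk
root (hd0,$bootpart)
setup (hd0)
quit
EOF
}

# Review the output first, then, as root:
#   grub_mirror_script /dev/sde 0 | grub --batch
```

Whether the BIOS or the 29160 will actually boot from a disk on the second channel is a separate question that this does not answer.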