[CentOS] how to replace a raid drive with mdadm

Sat May 10 17:06:34 UTC 2014

On 2014-05-10, CS_DBA <cs_dba at consistentstate.com> wrote:
>
> If we lose a drive in a raid 10 array (mdadm software raid) what are
> the steps needed to correctly do the following:
> - identify which physical drive it is

This is controller dependent.  Some support blinking the drive light to
identify it, others do not.  If yours does not, you need to jury-rig
something (e.g., either physically label the drive slots or drives, or
send some dummy data to the drive to get its activity light to blink).
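The "dummy data" trick could be sketched roughly as below.  This is only
an illustration: blink_drive is a made-up helper name, and it reads from
the drive rather than writing to it, which is the safer way to generate
activity.  A drive that is completely dead may not respond at all; in
that case, blinking the healthy members and finding the dark slot by
elimination is one option.

```shell
# Hypothetical helper: generate read activity on a device so that its
# activity light blinks.  Nothing is written to the device.
blink_drive() {
    dev=$1
    count=${2:-2000}    # number of 1 MiB blocks to read (arbitrary default)
    if [ -z "$dev" ]; then
        echo "usage: blink_drive DEVICE [MEGABYTES]" >&2
        return 1
    fi
    # Reads may be served from the page cache on a second run; on a real
    # disk, adding iflag=direct (GNU dd) should bypass the cache.
    dd if="$dev" of=/dev/null bs=1M count="$count" 2>/dev/null
}
```

Usage would be something like "blink_drive /dev/sdX" as root, with
/dev/sdX standing in for the member you want to locate.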

> - replace the drive

The md part is easy.  If md hasn't failed the drive already, then
you need to do that first:

mdadm /dev/mdN --fail /dev/sdXX

Then remove it from the array:

mdadm /dev/mdN --remove /dev/sdXX
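To see whether md has already failed the drive, /proc/mdstat marks
faulty members with "(F)", and "mdadm --detail /dev/mdN" gives a fuller
report.  A small sketch for pulling the failed member names out of
mdstat-style text follows; failed_members is a hypothetical helper name,
and the pattern assumes plain alphanumeric device names such as sdb1.

```shell
# Hypothetical helper: print the names of array members flagged (F)
# in /proc/mdstat-style input read from stdin.
failed_members() {
    # Match tokens like "sdb1[1](F)", then strip everything from "[" on.
    grep -o '[[:alnum:]]*\[[0-9]*\](F)' | sed 's/\[.*//'
}

# Typical use on a live system:
#   failed_members < /proc/mdstat
```

If this prints nothing for the drive you intend to pull, md has not
failed it yet, and the --fail step above is needed before --remove.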

The physical part is, again, hardware dependent.

> - add the new drive to the array and force it to re-sync

Again, the physical part is hardware dependent.  Once the kernel knows
about your new drive, this should work (partition the drive beforehand
if needed):

mdadm /dev/mdN --add /dev/sdYY

There may be extra parameters for replacing a failed RAID10 drive, but I
suspect that md already knows the needed parameters, so just adding the
drive should kick off a rebuild of the failed member.
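Once the drive has been added, the rebuild shows up in /proc/mdstat as a
"recovery = NN%" line.  Below is a rough sketch for extracting that
percentage; rebuild_progress is a hypothetical helper name, and the
field layout is assumed to match the usual mdstat progress line.  The
commented sfdisk line is one common way to do the partitioning step for
MBR-style tables, with /dev/sdGOOD as a placeholder for a surviving
member; double-check the device names before running anything like it.

```shell
# Possible partitioning step before --add (placeholder device names):
#   sfdisk -d /dev/sdGOOD | sfdisk /dev/sdYY

# Hypothetical helper: print the recovery/resync percentage found in
# /proc/mdstat-style input read from stdin.
rebuild_progress() {
    # A progress line looks like: "[=>....]  recovery = 8.5% (...) finish=..."
    # so the percentage is two fields after the keyword.
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "recovery" || $i == "resync")
                print $(i + 2)
    }'
}

# Typical use on a live system:
#   rebuild_progress < /proc/mdstat
```

No output means no rebuild is currently running, either because it has
finished or because it never started.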

-- 
kkeller at wombat.san-francisco.ca.us