[CentOS] Software RAID1 Failure Help

Fri Feb 7 23:51:59 UTC 2014
SilverTip257 <silvertip257 at gmail.com>

On Fri, Feb 7, 2014 at 5:47 PM, Matt <matt.mailinglists at gmail.com> wrote:

> I am running software RAID1 on a somewhat critical server.  Today I
> noticed one drive is giving errors.  Good thing I had RAID.  I planned
> on upgrading this server in the next month or so.  Just wondering if
> there was an easy way to fix this to avoid rushing the upgrade?  Having
> a single drive is slowing down reads as well, I think.
> Thanks.

Maybe it is slowing things down, but I would recommend you fix your RAID1
mirror to avoid losing all your data.

Hopefully the information below helps you...

If you have hot-swap drives/caddies, you should be able to replace the
drive while the server keeps running.  First, hot fail and hot remove [0]
every RAID member on the failing drive (/dev/sdb in this example) from each
software RAID array that uses it.  Next, remove the drive from the SCSI
subsystem [1].  Then physically pull the drive and replace it with a
healthy one.  Make the OS detect the new drive [2].  From there, use sfdisk
to clone the partition structure from the working drive to the new one.
Finally, add the new partitions to your software RAID arrays and watch
/proc/mdstat as they rebuild.
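
The links for [1] and [2] are not included in this message.  On Linux, one
common way to do both steps is through sysfs, so the sketch below assumes
that is what was meant.  The disk name sdb comes from the example above;
the host name host0 is an assumption (check which SCSI host your disk
actually sits on).  SYSFS is a hypothetical override added here so the
functions can be pointed at a scratch directory for a dry run.

```shell
#!/bin/sh
# Sketch only: assumes a SCSI/SATA disk exposed through sysfs.
# SYSFS is a hypothetical override (default /sys) for dry runs.
SYSFS="${SYSFS:-/sys}"

# Detach a disk from the SCSI subsystem before pulling it.
# $1 = kernel name of the disk, e.g. sdb
scsi_detach() {
    echo 1 > "$SYSFS/block/$1/device/delete"
}

# Ask a SCSI host to rescan all channels, targets and LUNs
# so the replacement disk shows up.
# $1 = host name, e.g. host0 (assumed; yours may differ)
scsi_rescan() {
    echo "- - -" > "$SYSFS/class/scsi_host/$1/scan"
}

# Example (as root, after the mdadm -f / -r steps):
#   scsi_detach sdb
#   ...swap the drive...
#   scsi_rescan host0
```

Only detach the disk after all of its partitions have been failed and
removed from their arrays.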

-f or --fail
-r or --remove
-a or --add

mdadm /dev/mdX -f /dev/sdbY
mdadm /dev/mdX -r /dev/sdbY
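
If the failing disk has partitions in several arrays, it is easy to miss
one.  The following is a rough sketch, not a tested tool: it reads
/proc/mdstat and only prints the fail/remove commands for each member on
the given disk, so you can review them before running anything.  The
function name md_fail_remove_cmds is made up for this example, and it
assumes partition names of the form sdb1, sdb2, and so on.

```shell
#!/bin/sh
# Sketch only: print (do not run) the fail/remove commands for every
# md member that lives on one disk.
# $1 = disk name without /dev/, e.g. sdb
# $2 = mdstat file to read (default /proc/mdstat)
md_fail_remove_cmds() {
    awk -v disk="$1" '
        /^md/ {
            for (i = 3; i <= NF; i++) {
                if ($i !~ /\[[0-9]+\]/) continue   # skip words like active, raid1
                member = $i
                sub(/\[.*/, "", member)            # sdb1[1](F) becomes sdb1
                if (member ~ ("^" disk "[0-9]+$")) {
                    print "mdadm /dev/" $1 " -f /dev/" member
                    print "mdadm /dev/" $1 " -r /dev/" member
                }
            }
        }
    ' "${2:-/proc/mdstat}"
}

# Example: review the output, then run the lines by hand as root.
#   md_fail_remove_cmds sdb
```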

sfdisk -d /dev/sda | sfdisk /dev/sdb

mdadm /dev/mdX -a /dev/sdbY

watch cat /proc/mdstat
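
Once the rebuild finishes, every array should show all members up ([UU]
for a two-disk mirror) rather than a gap such as [U_].  As a quick check,
here is a small sketch that lists any array still missing a member.  The
function name degraded_arrays is made up for this example; it only reads
/proc/mdstat and changes nothing.

```shell
#!/bin/sh
# Sketch only: list md arrays whose status shows a missing member,
# e.g. "[2/1] [U_]" instead of "[2/2] [UU]".
# $1 = mdstat file to read (default /proc/mdstat)
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                  # remember the current array
        /\[[U_]*_[U_]*\]/ { print name }     # an underscore marks a missing member
    ' "${1:-/proc/mdstat}"
}

# Example: prints nothing when all arrays are healthy.
#   degraded_arrays
```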

[0] http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/

//  SilverTip257  //