[CentOS] Drive failed in 4-drive md RAID 10

Fri Sep 18 19:53:37 UTC 2020
Simon Matter <simon.matter at invoca.ch>

> I got the email that a drive in my 4-drive RAID10 setup failed. What are
> my options?
>
> Drives are WD1000FYPS (Western Digital 1 TB 3.5" SATA).
>
> mdadm.conf:
>
> # mdadm.conf written out by anaconda
> MAILADDR root
> AUTO +imsm +1.x -all
> ARRAY /dev/md/root level=raid10 num-devices=4 UUID=942f512e:2db8dc6c:71667abc:daf408c3
>
> /proc/mdstat:
> Personalities : [raid10]
> md127 : active raid10 sdf1[2](F) sdg1[3] sde1[1] sdd1[0]
>       1949480960 blocks super 1.2 512K chunks 2 near-copies [4/3] [UU_U]
>       bitmap: 15/15 pages [60KB], 65536KB chunk
>
> smartctl reports this for sdf:
> 197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -       1
> 198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline  -       6
>
> So it's got 6 bad blocks, 1 pending for remapping.
>
> Can I clear the error and rebuild? (It's not clear what commands would do
> that.) Or should I buy a replacement drive? I'm considering a WDS100T1R0A

Hi,

mdadm --remove /dev/md127 /dev/sdf1

and then the same command with --add in place of --remove should hot-remove
the device and add it back to the array.
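
Spelled out as a small script, that would be roughly the following. This is
only a sketch: the array and member names are the ones from your mdstat
output, and RUN is a made-up dry-run switch (not an mdadm feature) so that
nothing is touched unless you deliberately clear it.

```shell
#!/bin/sh
# Sketch only: names are taken from the /proc/mdstat output quoted in this thread.
ARRAY=/dev/md127
MEMBER=/dev/sdf1

# RUN is a hypothetical dry-run switch, not part of mdadm.
# By default the commands are only printed; set RUN to empty (as root) to execute them.
RUN=${RUN-echo}

readd_member() {
    # sdf1 is already marked (F) in mdstat, so it can be removed directly.
    $RUN mdadm --remove "$ARRAY" "$MEMBER"
    # Adding the partition back lets md start recovery onto it.
    $RUN mdadm --add "$ARRAY" "$MEMBER"
}

readd_member
```

Run as is, it just prints the two commands. Saved as, say, readd.sh (the
name is only an example), "RUN= sh readd.sh" as root would execute them, and
"cat /proc/mdstat" then shows the recovery progress.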

If it rebuilds without errors, the drive may well keep working for a long
time.
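
To see whether the array is back to full strength afterwards, one option is
to look for an underscore in the [UU_U] field of /proc/mdstat. A minimal
sketch, assuming the mdstat layout shown in your mail; degraded_arrays is
just a helper name picked for illustration, and it reads mdstat text on
stdin:

```shell
#!/bin/sh
# Sketch: print the md arrays whose status field still shows a missing
# member, i.e. an underscore inside the trailing [UU_U] field.
# degraded_arrays is a hypothetical helper name; it reads mdstat text on stdin.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }    # remember which array we are in
        /blocks/ && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }
    '
}

# Only read the live file when it exists (Linux with md loaded).
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

With the mdstat from your mail this prints md127. Once recovery has finished
and the field reads [UUUU], it should print nothing.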

Simon

> (2.5" 1TB red drive), which Amazon has for $135, plus the 3.5" adapter.
>
> The system serves primarily as a home mail server (it fetchmails from an
> outside VPS serving as my domain's MX) and archival file server.