[CentOS] [SOLVED] RAID5 suddenly broken

Thu Aug 18 08:36:50 UTC 2011
Mathieu Baudier <mbaudier at argeo.org>

> [root@livecd ~]# mdadm --misc -E /dev/md0
> mdadm: No md superblock detected on /dev/md0.
> [root@livecd ~]# mdadm --misc -Q /dev/md0
> /dev/md0: is an md device which is not active
> /dev/md0: No md super block found, not an md component.
> [root@livecd ~]# mdadm --misc -D /dev/md0
> mdadm: md device /dev/md0 does not appear to be active.

I was able to fix the issue.

Since the information on the internet is a bit messy and scary, here is a
summary of the problem and the solution, for future reference:

## PROBLEM
Due to a failure related to suspend, the RAID5 array became inconsistent.
The symptom was that the superblock (which allows auto-configuration of
the RAID array) was not recognized.

But the underlying member partitions were still recognized, as shown by the
following command, which printed details about the state of each of them:
> [root@livecd ~]# mdadm -E /dev/sd*3
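
To compare the members at a glance, the interesting fields of that output can
be filtered out. This is only a minimal sketch: it assumes the usual field
labels printed by mdadm -E (Update Time, State, Events), and md_key_fields is
a hypothetical helper name, not part of mdadm.

```shell
# Hypothetical helper (not part of mdadm): reads `mdadm -E` output on stdin
# and keeps only the per-device header lines plus a few key fields.
# Assumes the field labels "Update Time", "State" and "Events".
md_key_fields() {
    grep -E '^/dev/|^ *(Update Time|State|Events) :'
}

# Example use (needs root and real member partitions):
#   mdadm -E /dev/sd[abcd]3 | md_key_fields
```

Members whose Events counter lags behind the others are typically the ones a
plain --assemble refuses to use.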

## SOLUTION
(detailed output of the commands at the end of the post)

# Assemble the array with the --force option
mdadm -v --assemble --force /dev/md0 /dev/sd{a,b,c,d}3

# NOTE: assembling without the --force option was not enough
[root@livecd ~]# mdadm -v --assemble /dev/md0 /dev/sd{a,b,c,d}3
...
mdadm: /dev/md0 assembled from 2 drives - not enough to start the
array while not clean - consider --force.

# Check state
mdadm -D /dev/md0
...
    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed
       2       8       51        2      active sync   /dev/sdd3
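
The device table can also be checked mechanically. A small sketch that reads
mdadm -D output on stdin and prints the RaidDevice slot of every entry in the
"removed" state. The name missing_slots is hypothetical, and the column
positions are assumed from the output shown in this post.

```shell
# Hypothetical helper: list RaidDevice slots reported as "removed".
# Assumes the table layout: Number Major Minor RaidDevice State [device].
missing_slots() {
    awk '$1 ~ /^[0-9]+$/ && $5 == "removed" { print $4 }'
}

# Example use (needs root and an assembled array):
#   mdadm -D /dev/md0 | missing_slots
```

With the table above it would print 1, the slot that still needs a member.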

# Add missing partitions to the array
mdadm /dev/md0 -a /dev/sdb3
mdadm /dev/md0 -a /dev/sdc3

# Check that it is now OK
mdadm -D /dev/md0
...
    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      spare rebuilding   /dev/sdb3
       2       8       51        2      active sync   /dev/sdd3

       3       8       35        -      spare   /dev/sdc3

# and watch it rebuild:
...
 Rebuild Status : 3% complete
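
Instead of re-running the command by hand, the rebuild can be polled. This is
a rough sketch, assuming the "Rebuild Status : N% complete" line format shown
above; rebuild_percent and wait_for_rebuild are hypothetical helper names,
and the 60 second interval is arbitrary.

```shell
# Hypothetical helper: extract the rebuild percentage from `mdadm -D`
# output read on stdin. Prints nothing if no rebuild line is present.
rebuild_percent() {
    sed -n 's/^ *Rebuild Status : \([0-9][0-9]*\)% complete.*/\1/p'
}

# Hypothetical helper: poll an array until mdadm -D no longer reports
# a "Rebuild Status" line. Needs root.
wait_for_rebuild() {
    md=$1
    while pct=$(mdadm -D "$md" | rebuild_percent) && [ -n "$pct" ]; do
        echo "$md: rebuild at ${pct}%"
        sleep 60
    done
    echo "$md: no rebuild in progress"
}

# Example use:
#   wait_for_rebuild /dev/md0
```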

Many thanks to the CentOS LiveCD team; they saved the day.

Cheers,

Mathieu

## DETAILED OUTPUT

[root@livecd ~]# mdadm -v --assemble --force /dev/md0 /dev/sd{a,b,c,d}3
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda3 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb3 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc3 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdd3 is identified as a member of /dev/md0, slot 2.
mdadm: added /dev/sdc3 to /dev/md0 as 1
mdadm: added /dev/sdd3 to /dev/md0 as 2
mdadm: added /dev/sda3 to /dev/md0 as 0
mdadm: /dev/md0 has been started with 2 drives (out of 3).

[root@livecd ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Dec  1 12:01:05 2009
     Raid Level : raid5
     Array Size : 409592832 (390.62 GiB 419.42 GB)
  Used Dev Size : 204796416 (195.31 GiB 209.71 GB)
   Raid Devices : 3
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Aug 17 14:47:36 2011
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 256K

           UUID : 7533411a:f066a145:1e89d48e:1a8374a3
         Events : 0.38857

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed
       2       8       51        2      active sync   /dev/sdd3


[root@livecd ~]# mdadm /dev/md0 -a /dev/sdb3
mdadm: re-added /dev/sdb3
[root@livecd ~]# mdadm /dev/md0 -a /dev/sdc3
mdadm: added /dev/sdc3
[root@livecd ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Dec  1 12:01:05 2009
     Raid Level : raid5
     Array Size : 409592832 (390.62 GiB 419.42 GB)
  Used Dev Size : 204796416 (195.31 GiB 209.71 GB)
   Raid Devices : 3
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Aug 17 14:47:36 2011
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 2

         Layout : left-symmetric
     Chunk Size : 256K

 Rebuild Status : 0% complete

           UUID : 7533411a:f066a145:1e89d48e:1a8374a3
         Events : 0.38857

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      spare rebuilding   /dev/sdb3
       2       8       51        2      active sync   /dev/sdd3

       3       8       35        -      spare   /dev/sdc3