[CentOS] Some RAID-6 observations ... RHEL-6 vs CentOS-5.5

Thu Feb 10 18:36:46 UTC 2011
Chuck Munro <chuckm at seafoam.net>

Hello all,

In the process of building a new VM box to replace several individual 
CentOS servers, I've had the "interesting" experience of running both 
CentOS-5.5 and RHEL-6 (eval copy) as I build out the hardware based on a 
Supermicro motherboard.

A couple of observations regarding RAID-6:

- RAID-6 arrays created on RHEL-6 don't seem to be backward compatible
with CentOS-5.5.  This may or may not be expected behaviour, but it makes
me wonder whether forward compatibility might be a problem for those of
us migrating large RAID-6 arrays from CentOS-5 to -6 systems.  The main
symptoms are that the md superblock is declared bad, and I get boot-time
messages claiming the available blocks are less than the configured
size.  As a result, the RAID arrays are not made available on
CentOS-5.5.

- A total of 8 RAID-6 arrays in RHEL-6 are built with 2TB data disks
plus a hot spare.  However, every time I boot the system, some (but not
all) of the hot spare partitions are missing, seemingly at random.  Using
mdadm to manually add the spare partition back into an array always
works.  There are 4 array partitions per disk, and the largest single
array is just over 2 TB of usable space.  There are two groups of 5+1
drives spread across two SAS/SATA controller cards, and this random
behaviour can occur on either one.  Rebooting gets me a different set of
missing hot spare partitions.  The only error messages I get are the
mdmonitor "missing spares event" emails ... nothing in dmesg.
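
For anyone who wants to check for the same symptom after a reboot, here
is a minimal sketch in Python that reads the text of /proc/mdstat and
lists the arrays that have come up with fewer spares than expected.  It
relies on the kernel marking spare members with "(S)" in /proc/mdstat.
The function names and the default of one expected spare per array are
my own choices for illustration, not part of mdadm.

```python
import re

# One member entry on an array line, e.g. "sdf1[5](S)" or "sdb1[1]".
# Any flags such as (S) for spare or (F) for faulty follow the slot number.
MEMBER_RE = re.compile(r"(\S+?)\[(\d+)\]((?:\([A-Z]\))*)")


def spares_by_array(mdstat_text):
    """Return {array_name: [spare member names]} from /proc/mdstat text."""
    result = {}
    for line in mdstat_text.splitlines():
        # Array lines look like "md0 : active raid6 sdf1[5](S) sde1[4] ..."
        head, sep, tail = line.partition(" : ")
        if not sep or not head.startswith("md"):
            continue
        spares = [name for name, _slot, flags in MEMBER_RE.findall(tail)
                  if "(S)" in flags]
        result[head.strip()] = sorted(spares)
    return result


def arrays_missing_spares(mdstat_text, expected=1):
    """List arrays that currently have fewer than `expected` spares.

    `expected` is an assumption: one hot spare per array.
    """
    found = spares_by_array(mdstat_text)
    return sorted(name for name, spares in found.items()
                  if len(spares) < expected)
```

On a live system the input would come from something like
open("/proc/mdstat").read().  Each array it reports is a candidate for
re-adding the spare by hand with "mdadm /dev/mdN --add /dev/sdXN", where
the device names are placeholders.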

The host OS has its own RAID-1 (+ hot spare) arrays on a separate set of 
disks and controller, and those arrays always come up correctly.

I've checked and double-checked the mdadm.conf file to make sure the 
ARRAY, DEVICE and 'devices=....' statements are correct.  Using mdadm to 
shut down an array and then reassemble it always seems to work properly, 
but something random is happening at boot-time assembly.  It's only the 
hot spare partitions that don't appear, never the data partitions.  Weird.
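
Checking a long mdadm.conf by eye is error-prone, so here is a rough
sketch in Python that pulls the ARRAY lines apart and reports the
"spares=" count and the "devices=" list for each array.  It is a
simplified reader, not a full parser: it assumes the ARRAY keyword is
spelled out in full, treats everything after a "#" as a comment, and
joins lines that start with whitespace onto the previous line, as
mdadm.conf allows.  The function name and the shape of the returned
dictionary are my own invention.

```python
def parse_array_lines(conf_text):
    """Return {md_device: {"spares": int or None, "devices": [names]}}."""
    # Build logical lines: a line that starts with whitespace continues
    # the previous line in mdadm.conf.
    logical = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # simplification: drop comments
        if not line.strip():
            continue
        if raw[0].isspace() and logical:
            logical[-1] += " " + line.strip()
        else:
            logical.append(line.strip())

    arrays = {}
    for line in logical:
        words = line.split()
        if len(words) < 2 or words[0].lower() != "array":
            continue
        entry = {"spares": None, "devices": []}
        for word in words[2:]:
            key, _sep, value = word.partition("=")
            key = key.lower()
            if key == "spares" and value.isdigit():
                entry["spares"] = int(value)
            elif key == "devices":
                # devices= is a comma-separated list of names or patterns
                entry["devices"].extend(v for v in value.split(",") if v)
        arrays[words[1]] = entry
    return arrays
```

Feeding it the contents of /etc/mdadm.conf gives one entry per ARRAY
line.  An entry whose "spares" value is None has no "spares=" keyword,
and one whose "devices" list lacks the hot spare partition is worth a
second look.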

I'm hoping CentOS-6 doesn't present me with the same problem.  Because 
I'm not a registered RHEL user, I can't file a bug report with Red Hat.

Chuck