[CentOS] mdX and mismatch_cnt when building an array

Fri Nov 16 00:01:05 UTC 2012
Mike V <mike at vandv-systems.com>

Steve Thompson <smt at ...> writes:

> CentOS 6.3, x86_64.
> I have noticed when building a new software RAID-6 array on CentOS 6.3 
> that the mismatch_cnt grows monotonically while the array is building:
> # cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md11 : active raid6 sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[0]
>        3904890880 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/6] 
>        [==================>..]  resync = 90.2% (880765600/976222720) finish=44.6min speed=35653K/sec
> # cat /sys/block/md11/md/mismatch_cnt
> 1439285488
> The mismatch count grows until the assembly is complete, and then remains 
> at its highest value. A subsequent check resets it to zero (immediately) 
> and everything is fine thereafter. The device is not in use by any other 
> system component. I have reproduced this on several different systems; it 
> always happens with CentOS 6.3 and never with CentOS 5.x and earlier (in 
> 5.x, mismatch_cnt always stays at zero while assembling). I am using whole 
> drives in this example, but it's the same if I use partitions instead. The 
> count, size and type of drives appear to have no bearing.
> Perhaps just a curiosity, but I'm curious as to why it does this.
> Steve

This problem is not unique to CentOS. I can confirm the same behavior on 
Ubuntu 12.10:

mdv@ubuntu2:/sys/block/md0/md$ mdadm -V
mdadm - v3.2.5 - 18th May 2012
mdv@ubuntu2:/sys/block/md0/md$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] 
md0 : active raid6 sdm1[4] sdl1[3] sdj1[2] sdg1[1] sdf1[0]
      8790398976 blocks super 1.2 level 6, 2048k chunk, algorithm 2 [5/5] 
      [======>..............]  resync = 30.0% (879840768/2930132992) finish=978.5min speed=34918K/sec
      bitmap: 246/350 pages [984KB], 4096KB chunk, file: /var/lib/raid/md0bitmap

md1 : active raid6 sda[2] sdi1[1] sde1[4] sdd1[0] sdc1[3]
      5860534272 blocks level 6, 1024k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 11/15 pages [44KB], 65536KB chunk

unused devices: <none>
mdv@ubuntu2:/sys/block/md0/md$ cat mismatch_cnt
mdv@ubuntu2:/sys/block/md0/md$ cat rd?/errors
mdv@ubuntu2:/sys/block/md0/md$
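
For anyone who wants to watch the counter while a resync is running, a 
minimal sketch along these lines may help. The function names md_status and 
watch_mismatch are made up for illustration, and the script assumes the 
array's sysfs md directory exposes sync_action, sync_completed and 
mismatch_cnt (sync_completed does not appear in the transcripts above).

```shell
# Print one status line for an md array.
# $1 is the array's sysfs md directory, e.g. /sys/block/md0/md (assumed layout).
md_status() {
    md_dir=$1
    action=$(cat "$md_dir/sync_action")
    progress=$(cat "$md_dir/sync_completed")
    mismatches=$(cat "$md_dir/mismatch_cnt")
    printf '%s %s mismatch_cnt=%s\n' "$action" "$progress" "$mismatches"
}

# Hypothetical helper: print a status line every $2 seconds (default 60)
# until sync_action reports "idle", then print one final line.
watch_mismatch() {
    md_dir=$1
    interval=${2:-60}
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        md_status "$md_dir"
        sleep "$interval"
    done
    md_status "$md_dir"
}
```

It would be called as, for example, watch_mismatch /sys/block/md0/md 60.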

The drives were burned in without errors (/sbin/badblocks -v -w -b 4096) and 
passed short and long SMART tests. No SMART errors have been logged, and 
there are no ATA error messages in the kernel log files.

I've built this array several times with the same results. After a build 
completes, running a repair followed by a check
root@ubuntu2:/sys/devices/virtual/block/md0/md# echo "repair" > sync_action
... wait 18 hours
root@ubuntu2:/sys/devices/virtual/block/md0/md# echo "check" > sync_action
... wait 18 more hours
results in a mismatch_cnt of 0.
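
The repair-then-check sequence above could be scripted roughly as follows, 
so the two passes run back to back without someone watching the clock. This 
is only a sketch: run_sync_action and repair_then_check are hypothetical 
names, the one-second pause before polling is a guess rather than a known 
requirement, and it must run as root against the real sysfs directory.

```shell
# Start a sync action ("repair" or "check") and block until the array
# goes back to "idle". $1 = sysfs md directory, $2 = action,
# $3 = polling interval in seconds (default 60).
run_sync_action() {
    md_dir=$1
    action=$2
    interval=${3:-60}
    echo "$action" > "$md_dir/sync_action"
    # Assumption: give the kernel a moment to report the new action.
    sleep 1
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        sleep "$interval"
    done
}

# Run a repair pass, then a check pass, then print the final mismatch_cnt.
repair_then_check() {
    md_dir=$1
    interval=${2:-60}
    run_sync_action "$md_dir" repair "$interval"
    run_sync_action "$md_dir" check "$interval"
    cat "$md_dir/mismatch_cnt"
}
```

For the array in this post it would be invoked as 
repair_then_check /sys/devices/virtual/block/md0/md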