[CentOS] RAID questions

Sat Feb 18 04:25:14 UTC 2017
Keith Keller <kkeller at wombat.san-francisco.ca.us>

On 2017-02-17, John R Pierce <pierce at hogranch.com> wrote:
> On 2/16/2017 9:18 PM, Keith Keller wrote:
>>> Only some systems support that sort of restriping, and its a dangerous
>>> activity (if the power fails or system crashes midway through the
>>> restriping operation, its probably not restartable, you quite likely
>>> will lose the whole volume)
>> Doesn't mdraid support changing RAID levels?  I think it will even do it
>> reasonably safely (though still better not to have a power failure!).  I
>> have a vague memory of adding a drive to a RAID5 and converting it to a
>> RAID6 but I could be misremembering.
>
> any such operation requires the entire raid to be re-slivered, stripe by 
> stripe, as ALL the data moves around. on a large raid made from 
> multi-terabyte drives, this would take DAYS.

Yes, it would take a long time, but the man page for mdadm implies that
it's reasonably safe:

       Changing the number of active devices in a RAID5 or RAID6 is much
       more effort.  Every block in the array will need to be read and
       written back to a new location.  From 2.6.17, the Linux Kernel is
       able to increase the number of devices in a RAID5 safely, including
       restarting an interrupted "reshape".  From 2.6.31, the Linux Kernel
       is able to increase or decrease the number of devices in a RAID5 or
       RAID6.
[...]
       When relocating the first few stripes on a RAID5 or RAID6, it is not
       possible to keep the data on disk completely consistent and
       crash-proof.  To provide the required safety, mdadm disables writes
       to the array while this "critical section" is reshaped, and takes a
       backup of the data that is in that section.  For grows, this backup
       may be stored in any spare devices that the array has, however it
       can also be stored in a separate file specified with the
       --backup-file option, and is required to be specified for shrinks,
       RAID level changes and layout changes.  If this option is used, and
       the system does crash during the critical period, the same file must
       be passed to --assemble to restore the backup and reassemble the
       array.  When shrinking rather than growing the array, the reshape is
       done from the end towards the beginning, so the "critical section"
       is at the end of the reshape.

(Thanks to Gordon for the pointer to the GROW section of the mdadm man page.)
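
In practice that boils down to something like this (device names and the
backup path below are just placeholders, obviously):

    # convert a 5-disk RAID5 into a 6-disk RAID6 (with the sixth disk
    # already added as a spare); a level change requires --backup-file
    mdadm --grow /dev/md0 --level=6 --raid-devices=6 \
          --backup-file=/root/md0-reshape.bak

    # if the box crashes during the critical section, pass the same file
    # back to --assemble to restore the backup and restart the reshape
    mdadm --assemble /dev/md0 --backup-file=/root/md0-reshape.bak \
          /dev/sd[b-g]1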

It's been a long time since I did this, but I seem to remember taking an
md RAID5 of ~10 2TB drives, adding one drive, and reshaping it to RAID6;
the reshape took 2-3 days.
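
From memory the sequence was just an --add followed by the same sort of
--grow as above (device names invented again, and the --raid-devices count
assumes the array really did have 10 drives), then a lot of watching
/proc/mdstat:

    mdadm --add /dev/md0 /dev/sdk1
    mdadm --grow /dev/md0 --level=6 --raid-devices=11 \
          --backup-file=/root/md0-reshape.bak
    cat /proc/mdstat   # reshape progress ticks along here for a few days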

The old 3ware controllers claimed to support this sort of reshaping, but
the one time I tried it, it failed.  I don't know whether LSI or Areca
controllers support it.

--keith

-- 
kkeller at wombat.san-francisco.ca.us