[CentOS] C7, mdadm issues

Thu Jan 31 08:23:39 UTC 2019
Alessandro Baggi <alessandro.baggi at gmail.com>

On 31/01/19 07:34, Simon Matter wrote:
>> On 30/01/19 16:49, Simon Matter wrote:
>>>> On 01/30/19 03:45, Alessandro Baggi wrote:
>>>>> On 29/01/19 20:42, mark wrote:
>>>>>> Alessandro Baggi wrote:
>>>>>>> On 29/01/19 18:47, mark wrote:
>>>>>>>> Alessandro Baggi wrote:
>>>>>>>>> On 29/01/19 15:03, mark wrote:
>>>>>>>>>
>>>>>>>>>> I've no idea what happened, but the box I was working on last
>>>>>>>>>> week
>>>>>>>>>> has a *second* bad drive. Actually, I'm starting to wonder about
>>>>>>>>>> that particular hot-swap bay.
>>>>>>>>>>
>>>>>>>>>> Anyway, mdadm --detail shows /dev/sdb1 removed. I've added
>>>>>>>>>> /dev/sdi1... but see both /dev/sdh1 and /dev/sdi1 as spares, and
>>>>>>>>>> have yet to find a reliable way to make either one active.
>>>>>>>>>>
>>>>>>>>>> Actually, I would have expected the Linux RAID to replace a
>>>>>>>>>> failed
>>>>>>>>>> one with a spare....
>>>>>>
>>>>>>>>> Can you report your RAID configuration (RAID level and number of
>>>>>>>>> devices) and the current status from /proc/mdstat?
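>>>>>>>>> For example, something like (with /dev/md0 being your array):
>>>>>>>>>
>>>>>>>>> cat /proc/mdstat
>>>>>>>>> mdadm --detail /dev/md0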
>>>>>>>>>
>>>>>>>> Well, nope. I got to the point of rebooting the system (xfs had the
>>>>>>>> RAID volume and wouldn't let go; I also commented out the RAID
>>>>>>>> volume).
>>>>>>>>
>>>>>>>> It's RAID 5; /dev/sdb *also* appears to have died. If I do
>>>>>>>> mdadm --assemble --force -v /dev/md0 /dev/sd[cefgdh]1
>>>>>>>> I get:
>>>>>>>> mdadm: looking for devices for /dev/md0
>>>>>>>> mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
>>>>>>>> mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot -1.
>>>>>>>> mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
>>>>>>>> mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
>>>>>>>> mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 4.
>>>>>>>> mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot -1.
>>>>>>>> mdadm: no uptodate device for slot 1 of /dev/md0
>>>>>>>> mdadm: added /dev/sde1 to /dev/md0 as 2
>>>>>>>> mdadm: added /dev/sdf1 to /dev/md0 as 3
>>>>>>>> mdadm: added /dev/sdg1 to /dev/md0 as 4
>>>>>>>> mdadm: no uptodate device for slot 5 of /dev/md0
>>>>>>>> mdadm: added /dev/sdd1 to /dev/md0 as -1
>>>>>>>> mdadm: added /dev/sdh1 to /dev/md0 as -1
>>>>>>>> mdadm: added /dev/sdc1 to /dev/md0 as 0
>>>>>>>> mdadm: /dev/md0 assembled from 4 drives and 2 spares - not enough to
>>>>>>>> start the array.
>>>>>>>>
>>>>>>>> --examine shows me /dev/sdd1 and /dev/sdh1, but both of them are
>>>>>>>> listed as spares.
>>>>>>> Hi Mark,
>>>>>>> please post the result from
>>>>>>>
>>>>>>> cat /sys/block/md0/md/sync_action
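>>>>>>>
>>>>>>> and also, if you can, the event counts and device roles from the
>>>>>>> superblocks -- something along these lines (adjust the device list
>>>>>>> to your drives):
>>>>>>>
>>>>>>> mdadm --examine /dev/sd[c-i]1 | grep -E 'Events|Device Role'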
>>>>>>
>>>>>> There is none. There is no /dev/md0. mdadm refuses, saying that it's
>>>>>> lost too many drives.
>>>>>>
>>>>>>          mark
>>>>>>
>>>>>
>>>>>
>>>>> I suppose that your config is 5 drives and 1 spare, with 1 drive failed.
>>>>> It's strange that your spare was not used for the resync.
>>>>> Then you added a new drive, but the array does not start because it
>>>>> marks the new disk as a spare, so you have a RAID5 with 4 devices and
>>>>> 2 spares.
>>>>>
>>>>> First, I hope that you have a backup of all your data and don't run any
>>>>> exotic commands before backing it up. If you can't back up your data,
>>>>> that's a problem.
>>>>
>>>> This is at work. We have automated nightly backups, and I do offline
>>>> backups
>>>> of the backups every two weeks.
>>>>>
>>>>> Have you tried removing the last added device (sdi1), restarting the
>>>>> RAID and forcing a resync?
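>>>>> Something along these lines, perhaps (untested, so please double-check
>>>>> the device names against your setup first):
>>>>>
>>>>> # make sure nothing is half-assembled (ignore the error if /dev/md0 is
>>>>> # not running), then try to bring the array up again from the original
>>>>> # members only, leaving the new disk out
>>>>> mdadm --stop /dev/md0
>>>>> mdadm --assemble --force --run /dev/md0 /dev/sd[cdefgh]1
>>>>>
>>>>> # once it is running (degraded), re-add the new disk so it resyncs
>>>>> mdadm /dev/md0 --add /dev/sdi1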
>>>>
>>>> The thing is, it had one? two? spares when /dev/sdb1 started dying, and
>>>> it
>>>> didn't use them.
>>>
>>> For many years now I've only been doing RAID1, because it's just safer
>>> than RAID5 and easier than RAID6 if the number of disks is low.
>>>
>>
>> Like you, I always run RAID1, but for the last year I've been running a
>> RAID5 with 3TB WD Red drives for my personal backup server and have never
>> had an error so far.
>>
>> What about RAID10 vs RAID5/RAID6? You lose half the capacity, but it's as
>> performant as RAID5 and as reliable as RAID1.
> 
> I did RAID10 in the past but don't do it now. If you do large linear
> read/writes, RAID10 may perform better; if you have lots of independent
> and random read/writes, RAID1 may perform better. It really depends a lot
> on how the disks are used.
> 
>>
>> Have you tried other types of RAID like RAID50 or RAID60?
> 
> Yes, I did in the past, but it adds even more complexity than I like.
> 
>>
>> About the resync process: are all RAID levels disk killers during this
>> procedure, or is only RAID5 (and similar) a disk killer?
> 
> I wouldn't call it a disk killer; it's more that it detects disk errors but
> does not produce them.
> 
>>
>>> I also don't have much experience with spare handling, as I don't use it
>>> in my scenarios.
>>>
>>> However, in general I think the problem today is this:
>>> We have very large disks these days, and defects on a disk are often not
>>> found for a long time. Even with raid-check, I think errors which only
>>> show up while writing, not while reading, are not found.
>>>
>>> So now, if one disk fails, things are still okay. Then, when a spare is in
>>> place or the defective disk has been replaced, the resync starts. Now, if
>>> there is any error on one of the old disks while the resync happens, boom,
>>> the array fails and is in a bad state.
>>>
>>> I once had to recover a broken RAID5 from some Linux-based NAS, and what
>>> I did was:
>>> * Dump the complete RAID partition of every disk to a file, ignoring the
>>> read errors on one of the disks (rough sketch of this step below).
>>> * Build the RAID5 like this:
>>>
>>> mdadm --create --assume-clean --level=5 --raid-devices=4 --spare-devices=0 \
>>>     --metadata=1.0 --layout=left-symmetric --chunk=64 --bitmap=none \
>>>     /dev/md10 /dev/loop0 missing /dev/loop2 /dev/loop3
>>>
>>> * Recover 99.9% of the data from /dev/md10.
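>>>
>>> The dump and loop setup was roughly like this -- from memory, so treat
>>> the tools, paths and device names as examples (I used something like GNU
>>> ddrescue to read past the bad sectors):
>>>
>>> # image every member partition, carrying on over read errors
>>> ddrescue /dev/sda2 /work/disk0.img /work/disk0.map
>>> ddrescue /dev/sdb2 /work/disk1.img /work/disk1.map
>>> ddrescue /dev/sdc2 /work/disk2.img /work/disk2.map
>>> ddrescue /dev/sdd2 /work/disk3.img /work/disk3.map
>>>
>>> # work on copies of the dumps, attached as loop devices; one member is
>>> # simply left out ('missing' in the mdadm --create above)
>>> losetup /dev/loop0 /work/copy-disk0.img
>>> losetup /dev/loop2 /work/copy-disk2.img
>>> losetup /dev/loop3 /work/copy-disk3.img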
>>>
>>
>> Why not recover directly from a backup? This saves time.
>> In your last command, why did you use /dev/loopN?
> 
> In that case, the owner of the NAS was a photographer who had all his past
> work on the NAS with no real backup :-(
> 
> What I did in that case was to dump all data from all disks of the array
> to files. Then I made copies of the original dump files to work with them.
> I didn't want to touch the disks more than needed.
> 
>>
>>> One more hint for those interested:
>>> Even with RAID1, I don't use the whole disk as one big RAID1. Instead, I
>>> slice it into equally sized parts - not physically :-) - and create
>>> multiple smaller RAID1 arrays on it. If a disk is 8TB, I create 8
>>> partitions of 1TB and then create 8 RAID1 arrays on it. Then I add all 8
>>> arrays to the same VG. Now, if there is a small error in, say, disk 3,
>>> only a 1TB slice of the whole 8TB is degraded. In large arrays you can
>>> even keep some spare slices on a spare disk to temporarily move broken
>>> slices. You get the idea, right?
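>>>
>>> For a pair of disks it would look roughly like this (just a sketch;
>>> device names, partition layout and the VG name are examples):
>>>
>>> # assume /dev/sda and /dev/sdb are already split into 8 equal partitions
>>> # each (sda1..sda8, sdb1..sdb8), e.g. with parted or sgdisk
>>>
>>> # one small RAID1 per partition pair
>>> for i in $(seq 1 8); do
>>>     mdadm --create /dev/md$i --level=1 --raid-devices=2 \
>>>         /dev/sda$i /dev/sdb$i
>>> done
>>>
>>> # all eight arrays end up as PVs in the same volume group
>>> pvcreate /dev/md{1..8}
>>> vgcreate vg_data /dev/md{1..8}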
>>>
>>
>> About this type of configuration: if you have 2 disks and create 8 RAID1
>> arrays on those two disks, won't you lose performance? As you said, if in a
> 
> Performance is the same, with maybe 0.1% overhead.
> 
>> single partition you get some bad error you save the other data, but if one
>> disk fails totally you have the same problem; moreover, you need to recreate 8
> 
> That's true, but in almost three decades of working with hard disks, I have
> rarely seen complete disk failures.
> 
>> partitions and resync 8 RAID1 arrays. This could require more time to
>> recover and possibly more human error.
> 
> That's true about human errors. But in this case, I usually create small
> scripts to do it, and I really look at those scripts very carefully before
> I run them :-)
> 
> Regards,
> Simon
> 
> 

Hi Simon,
thank you for your reply.

Best regards,
Alessandro.