Re: [CentOS] C7, mdadm issues

30 Jan 2019

      Alessandro Baggi wrote:
...
On 30/01/19 14:02, mark wrote:
...
On 01/30/19 03:45, Alessandro Baggi wrote:
...
On 29/01/19 20:42, mark wrote:
...
Alessandro Baggi wrote:
...
On 29/01/19 18:47, mark wrote:
...
Alessandro Baggi wrote:
> On 29/01/19 15:03, mark wrote:
>
>> I've no idea what happened, but the box I was working on
>> last week has a *second* bad drive. Actually, I'm starting
>> to wonder about that particular hot-swap bay.
>>
>> Anyway, mdadm --detail shows /dev/sdb1 removed. I've added
>> /dev/sdi1... but I see both /dev/sdh1 and /dev/sdi1 as spares,
>> and have yet to find a reliable way to make either one active.
>>
>> Actually, I would have expected the linux RAID to replace a
>> failed one with a spare....
...
...
> can you report your raid configuration like raid level and
> raid devices and the current status from /proc/mdstat?
>
Well, nope. I got to the point of rebooting the system (xfs had the RAID
volume, and wouldn't let go); I also commented out the RAID volume.
It's RAID 5, /dev/sdb *also* appears to have died. If I do

mdadm --assemble --force -v /dev/md0 /dev/sd[cefgdh]1
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot -1.
mdadm: no uptodate device for slot 1 of /dev/md0
mdadm: added /dev/sde1 to /dev/md0 as 2
mdadm: added /dev/sdf1 to /dev/md0 as 3
mdadm: added /dev/sdg1 to /dev/md0 as 4
mdadm: no uptodate device for slot 5 of /dev/md0
mdadm: added /dev/sdd1 to /dev/md0 as -1
mdadm: added /dev/sdh1 to /dev/md0 as -1
mdadm: added /dev/sdc1 to /dev/md0 as 0
mdadm: /dev/md0 assembled from 4 drives and 2 spares - not enough to start the array.

--examine shows me /dev/sdd1 and /dev/sdh1, but says that both are
spares.
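
The tally in the verbose assemble output can be checked mechanically. This is only a minimal sketch: it assumes the mdadm messages have been saved one per line, and the function name count_slots and the file name assemble.log are made up for illustration. Members reported with slot -1 are counted as spares, everything else as active.

```shell
# count_slots: read "mdadm --assemble -v" messages on stdin (one per line)
# and report how many members hold a real slot versus slot -1 (spare).
# The function name is hypothetical; it is not part of mdadm.
count_slots() {
  awk '
    /is identified as a member of/ {
      slot = $NF                 # last field, e.g. "0." or "-1."
      sub(/\.$/, "", slot)       # strip the trailing period
      if ((slot + 0) < 0) spares++
      else active++
    }
    END { printf "active=%d spares=%d\n", active, spares }
  '
}

# Usage, assuming the output was saved to a file named assemble.log:
#   count_slots < assemble.log
```

For the output quoted above this agrees with mdadm's own summary: four members in real slots and two spares, which is one short of what a six-slot RAID 5 needs to start.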
Hi Mark,
please post the result from
cat /sys/block/md0/md/sync_action
There is none. There is no /dev/md0. mdadm refuses, saying that
it's lost too many drives.
mark
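
The sync_action file only exists while the array is assembled, which is why there was nothing to read here. A small sketch of a check that copes with both cases follows; the function name sync_state and its optional second argument (an alternative sysfs root, handy for trying it out without a real array) are assumptions for illustration only.

```shell
# sync_state: print the md sync_action for an array, or a note if the
# array is not assembled. Hypothetical helper, not part of mdadm.
#   $1  md device name (default: md0)
#   $2  sysfs root     (default: /sys)
sync_state() {
  md=${1:-md0}
  root=${2:-/sys}
  f="$root/block/$md/md/sync_action"
  if [ -r "$f" ]; then
    cat "$f"                      # e.g. idle, resync, recover
  else
    echo "$md: not assembled (no $f)"
  fi
}

# Usage:
#   sync_state md0
```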

CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
I suppose that your config is 5 drives and 1 spare, with 1 drive
failed. It's strange that your spare was not used for the resync.
Then you added a new drive, but the array does not start because it marks
the new disk as a spare, so you have a RAID 5 with 4 devices and 2 spares.
First, I hope that you have a backup of all your data; don't run any
exotic commands before backing it up. If you can't back up
your data, that's a problem.
This is at work. We have automated nightly backups, and I do offline
backups of the backups every two weeks.
...
Have you tried removing the last added device, sdi1, then restarting the
raid and forcing a resync?
The thing is, it had one? two? spares when /dev/sdb1 started dying, and
it didn't use them.
...
Have you tried removing these 2 devices and re-adding only the device
that will be useful for the resync? Maybe you can set 5 devices for your
raid and not 6; if it works (after the resync) you can add your spare
device by growing your raid set.
I tried, and that's when I lost it (again), and it refuses to
assemble/start the RAID: "not enough devices".
...
From what I read on Google, many users run --zero-superblock before
re-adding the device.
I can take one out, and re-add, but I think I'm going to have to
recreate the RAID again, and again restore from backup.
...
Other users reassemble the raid using --assume-clean, but I don't know
what effect it will have.
I hope that someone can give you better help with this.
Please update here if you find the solution.
Not that I'm into American football, but I seem to have pulled off what I
understand is called a hail-mary: *without* zeroing the superblocks, I
did a create with all six good drives, excluding /dev/sdb1, and explicitly
told it one spare.
And the array is there, complete with data, with *one* spare, five good
drives, and it's currently rebuilding the spare.
The last resort worked, though we'll see how long.
mark
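
The exact command was not posted, so what follows is only a sketch of the general shape of such a create, not a record of what was run. The helper print_create_cmd is hypothetical and only prints a command line; it runs nothing. It assumes the devices are passed in their original slot order, with the spares last, since mdadm --create treats the first --raid-devices entries as active members and the rest as spares. Re-creating over existing data only preserves it if the level, device order, chunk size, and metadata layout match the original array, so this is very much a last resort with backups in hand.

```shell
# print_create_cmd: print (do NOT run) an mdadm --create command line.
# Hypothetical helper for illustration.
#   $1  md device, e.g. /dev/md0
#   $2  RAID level, e.g. 5
#   $3  number of spares among the listed devices
#   $4+ member devices, active members first, spares last
print_create_cmd() {
  md=$1
  level=$2
  spares=$3
  shift 3
  active=$(( $# - spares ))       # remaining devices are active members
  echo "mdadm --create $md --level=$level --raid-devices=$active --spare-devices=$spares $*"
}

# Usage, with placeholder device names DEV1..DEV6:
#   print_create_cmd /dev/md0 5 1 DEV1 DEV2 DEV3 DEV4 DEV5 DEV6
```

Printing the command first makes it easy to compare the device order against earlier --examine output before committing to anything.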
