[CentOS] DegradedArray message

Thu Dec 4 13:45:29 UTC 2014
David McGuffey <davidmcguffey at verizon.net>

Thanks for all the responses.  A little more digging revealed:

md0 is made up of two 250G disks. It holds the OS and a very large /var
partition that serves a number of virtual machines.

md1 is made up of two 2T disks on which /home resides.

The challenge is that disk 0 of md0 is the one with the problem, and it
also has a 524M /boot partition outside of the raid partition.

My plan is to back up /home (md1) and at a minimum /etc/libvirt
and /var/lib/libvirt (md0) before I do anything else.

Here are the log entries for 'raid'

Dec  1 20:50:15 desk4 kernel: md/raid1:md1: not clean -- starting background reconstruction
Dec  1 20:50:15 desk4 kernel: md/raid1:md1: active with 2 out of 2 mirrors
Dec  1 20:50:15 desk4 kernel: md/raid1:md0: active with 1 out of 2 mirrors
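
For anyone who wants to pull the same information out of the logs with a
script, here is a minimal Python sketch. It assumes the kernel message
format shown above ("md/raid1:mdN: active with X out of Y mirrors"); the
function name degraded_arrays is made up for illustration, and other
RAID levels or kernel versions may word the message differently.

```python
import re

# Matches kernel lines such as:
#   "... kernel: md/raid1:md0: active with 1 out of 2 mirrors"
MIRROR_RE = re.compile(
    r"md/raid1:(md\d+): active with (\d+) out of (\d+) mirrors"
)


def degraded_arrays(log_text):
    """Return {array: (active, total)} for raid1 arrays missing a mirror."""
    # Collapse all whitespace so an entry that was wrapped across
    # lines (as can happen when pasted into mail) still matches.
    flat = " ".join(log_text.split())
    result = {}
    for name, active, total in MIRROR_RE.findall(flat):
        if int(active) < int(total):
            result[name] = (int(active), int(total))
    return result


# The three entries quoted in this mail, wrapped as a mail client might.
sample = """\
Dec  1 20:50:15 desk4 kernel: md/raid1:md1: not clean -- starting
background reconstruction
Dec  1 20:50:15 desk4 kernel: md/raid1:md1: active with 2 out of 2
mirrors
Dec  1 20:50:15 desk4 kernel: md/raid1:md0: active with 1 out of 2
mirrors
"""

print(degraded_arrays(sample))  # {'md0': (1, 2)}
```

On a live system the text would come from /var/log/messages rather than
from a pasted sample.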

This is a desktop, not a server. We've had several short (<20 sec) power
outages over the last month. The last one was on 1 Dec. I suspect the
sudden loss and restoration of power could have trashed a portion of
disk 0 in md0.

I finally obtained an APC UPS (BX1500G) and have installed, configured,
and tested it. In the future, it will carry me through these short
outages.

I'll obtain a new 250G (or larger) drive and start rooting around for
guidance on how to replace a drive with the MBR and /boot on it.

On Wed, 2014-12-03 at 22:11 +0100, Leon Fauster wrote:
> Hi David,
> 
> On 03.12.2014 at 02:14, David McGuffey <davidmcguffey at verizon.net> wrote:
> > This is an automatically generated mail message from mdadm
> > running on desk4
> > 
> > A DegradedArray event had been detected on md device /dev/md0.
> > 
> > Faithfully yours, etc.
> > 
> > P.S. The /proc/mdstat file currently contains the following:
> > 
> > Personalities : [raid1] 
> > md0 : active raid1 dm-2[1]
> >      243682172 blocks super 1.1 [2/1] [_U]
> >      bitmap: 2/2 pages [8KB], 65536KB chunk
> > 
> > md1 : active raid1 dm-3[0] dm-0[1]
> >      1953510268 blocks super 1.1 [2/2] [UU]
> >      bitmap: 3/15 pages [12KB], 65536KB chunk
> 
> 
> The reason one drive was kicked out (see [_U] above) will
> be in /var/log/messages. If that drive is also part of md1, then
> it should be removed manually from md1 before replacing the
> hard disk.
> 
> --
> LF
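
The [2/1] [_U] status field in the mdstat output quoted above can also
be checked by a script. What follows is a rough Python sketch, assuming
the simple layout shown in that output (an "mdN : active raid1 ..."
header followed by a status line). The name parse_mdstat is hypothetical,
and real /proc/mdstat files can contain variants (inactive arrays,
resync progress lines) that this does not handle.

```python
import re

# Array header, e.g. "md0 : active raid1 dm-2[1]"
HEADER_RE = re.compile(r"^(md\d+) : (\w+) (\w+) (.*)$")
# Status field, e.g. "[2/1] [_U]"; "_" marks a missing member
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def parse_mdstat(text):
    """Return {array: {"members": [...], "degraded": bool or None}}."""
    arrays = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            arrays[current] = {
                "members": header.group(4).split(),
                "degraded": None,  # unknown until a status line is seen
            }
            continue
        status = STATUS_RE.search(line)
        if status and current:
            arrays[current]["degraded"] = "_" in status.group(3)
    return arrays


# The mdstat content from the mdadm mail, without the quote prefixes.
sample = """\
Personalities : [raid1]
md0 : active raid1 dm-2[1]
     243682172 blocks super 1.1 [2/1] [_U]
     bitmap: 2/2 pages [8KB], 65536KB chunk

md1 : active raid1 dm-3[0] dm-0[1]
     1953510268 blocks super 1.1 [2/2] [UU]
     bitmap: 3/15 pages [12KB], 65536KB chunk
"""

for name, info in parse_mdstat(sample).items():
    print(name, "DEGRADED" if info["degraded"] else "ok")
```

On the machine itself, the input would be the contents of /proc/mdstat
instead of the pasted sample.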