[CentOS] more software raid questions

Thu Oct 21 16:13:45 UTC 2010

On Thu, Oct 21, 2010 at 08:59:13AM -0700, Nataraj wrote:
> fred smith wrote:
> > On Tue, Oct 19, 2010 at 07:34:19PM -0700, Nataraj wrote:
> >   
> >>
> >> I've seen this kind of thing happen when the autodetection stuff 
> >> misbehaves. I'm not sure why it does this or how to prevent it. Anyway, 
> >> to recover, I would use something like:
> >>
> >> mdadm --stop /dev/md125
> >> mdadm --stop /dev/md126
> >>
> >> If for some reason the above commands fail, check and make sure it has 
> >> not automounted the file systems from md125 and md126. Hopefully this 
> >> won't happen.
> >>
> >> Then use:
> >> mdadm /dev/md0 -a /dev/sdXX
> >> To add back the drive which belongs in md0, and similar for md1. In 
> >> general, it won't let you add the wrong drive, but if you want to check use:
> >> mdadm --examine /dev/sda1 | grep UUID
> >> and so forth for all your drives and find the ones with the same UUID.
> >>     
> >
> > Well, I've already tried to use --fail and --remove on md125 and md126
> > but I'm told the members are still active.
> >
> > mdadm /dev/md126 --fail /dev/sdb1 --remove /dev/sdb1
> > mdadm /dev/md125 --fail /dev/sdb2 --remove /dev/sdb2
> >   
> You want to use --stop for the md125 and md126. Those are the raid 
> devices that are not correct. Once they are stopped, you can take the 
> drives from them and return them to md0 and md1 where they belong.

> 
> You will need to add the correct drive that was originally paired in 
> each raid set, but as I mentioned, it won't let you add the wrong 
> drives, so just try adding sdb1 to md0, then if it doesn't work, add it 
> to md1. You can't fail out drives from arrays that only have one drive.

Thanks for the additional information.

I'll try backing up everything this weekend then will take a stab at it.
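
For reference, here is a rough sketch of the sequence Nataraj describes,
written as a dry run so that nothing is touched by default. The pairing of
sdb1 with md0 and sdb2 with md1 is my assumption from the partition
numbering, and the run/recover helpers and the DRY_RUN variable are just
names made up for this sketch.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (as root, and only after a backup).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop the stray arrays, then re-add their members.
recover() {
    # Stop the wrongly assembled arrays so their members are released.
    run mdadm --stop /dev/md125 || return 1
    run mdadm --stop /dev/md126 || return 1
    # ASSUMPTION: sdb1 belongs in md0 and sdb2 in md1.
    run mdadm /dev/md0 -a /dev/sdb1 || return 1
    run mdadm /dev/md1 -a /dev/sdb2 || return 1
}
```

Calling recover prints the four mdadm commands in order. With DRY_RUN=0 it
would execute them instead and stop at the first one that fails.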

Someone said earlier that the differing raid superblocks were probably
the cause of the misassignment in the first place, but I have no clue
how the superblocks could have become messed up. Can any of you comment
on that? Will I need to hack at that issue, too, before I can succeed?
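
One way to see whether the members really disagree would be to print the
metadata version and array UUID that mdadm --examine reports for each
partition. A minimal sketch, assuming the usual "Key : value" layout of that
output (the field is labelled "UUID" with 0.90 metadata and "Array UUID"
with 1.x); examine_field and summarize are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "Key : value" field
# from `mdadm --examine` output read on stdin.
examine_field() {
    awk -F' : ' -v key="$1" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }   # trim the key
        k == key { print $2; exit }
    '
}

# Hypothetical helper: one summary line per partition given as argument.
summarize() {
    for dev in "$@"; do
        out=$(mdadm --examine "$dev" 2>/dev/null) || true
        version=$(printf '%s\n' "$out" | examine_field "Version")
        uuid=$(printf '%s\n' "$out" | examine_field "UUID")
        # 1.x metadata labels the field "Array UUID" instead.
        [ -n "$uuid" ] || uuid=$(printf '%s\n' "$out" | examine_field "Array UUID")
        if [ -n "$uuid" ]; then
            echo "$dev version=$version uuid=$uuid"
        else
            echo "$dev: no md superblock reported"
        fi
    done
}
```

Run as root, e.g. summarize /dev/sda1 /dev/sdb1 /dev/sdb2. Partitions that
show the same UUID belong to the same array, and a differing version
between two halves of one mirror would point at the superblock mismatch.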

thanks again!

> 
> Nataraj
> > 	mdadm /dev/md126 --fail /dev/sdb1 --remove /dev/sdb1
> > 	mdadm: set /dev/sdb1 faulty in /dev/md126
> >
> >
> > 	mdadm: hot remove failed for /dev/sdb1: Device or resource busy
> >
> > with the intention of then re-adding them to md0 and md1.
> >
> > so I tried:
> >
> > mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
> > and got a similar message. 
> >
> > at which point I knew I was in over my head.
> >
> >   
> >> When I create my Raid arrays, I always use the option --bitmap=internal. 
> >> With this option set, a bitmap is used to keep track of which pages on 
> >> the drive are out of date and then you only resync pages which need 
> >> updating instead of recopying the whole drive when this happens. In the 
> >> past I once added a bitmap to an existing raid1 array using something 
> >> like this. This may not be the exact command, but I know it can be done:
> >> mdadm --grow /dev/mdN --bitmap=internal
> >>
> >> Adding the bitmap is very worthwhile and saves time and risk of data 
> >> loss by not having to recopy the whole partition.
> >>
> >> Nataraj
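
As far as I know, a bitmap is added to an existing array through grow mode,
i.e. mdadm --grow --bitmap=internal /dev/mdN. The sketch below only prints
that command, after checking /proc/mdstat for an existing "bitmap:" line.
has_bitmap and suggest_bitmap are hypothetical helper names, and the second
argument exists only so the check can be pointed at a saved copy of mdstat.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named array (e.g. md0) shows a
# "bitmap:" line in its mdstat stanza.
# $1 = array name, $2 = mdstat file (defaults to /proc/mdstat)
has_bitmap() {
    awk -v name="$1" '
        $1 == name && $2 == ":"     { in_array = 1; next }
        in_array && NF == 0         { exit }            # blank line ends the stanza
        in_array && /^[^ \t]/       { exit }            # so does the next array
        in_array && $1 == "bitmap:" { found = 1; exit }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}

# Hypothetical helper: print the grow command only if no bitmap exists yet.
suggest_bitmap() {
    if has_bitmap "$1" "$2"; then
        echo "$1 already has a bitmap"
    else
        echo "mdadm --grow --bitmap=internal /dev/$1"
    fi
}
```

For example, suggest_bitmap md0 prints either the command to run as root or
a note that the array already has a bitmap.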

-- 
---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
                    The Lord detests the way of the wicked 
                  but he loves those who pursue righteousness.
----------------------------- Proverbs 15:9 (niv) -----------------------------