fred smith wrote:
Thanks for the additional information.
I'll try backing up everything this weekend, then take a stab at it.
Someone said earlier that the differing RAID superblocks were probably the cause of the misassignment in the first place, but I have no clue how the superblocks could have become messed up. Can any of you comment on that? Will I need to hack at that issue, too, before I can succeed?
Thanks again!
Nataraj
I would first try adding the drives back in with:
mdadm /dev/mdN -a /dev/sdXn
Again, this is after having stopped the bogus md arrays.
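Spelled out, the stop-then-add sequence could look roughly like the sketch below. The function name and the example device names are placeholders, not taken from your system, so substitute whatever /proc/mdstat shows on your machine. Setting MDADM to something harmless lets you preview the commands before running them for real as root.

```shell
# Sketch only: readd_member and the device names are hypothetical.
# Override MDADM (e.g. MDADM=echo) to preview instead of execute.
readd_member() {
    bogus_md=$1   # the wrongly assembled array, e.g. /dev/md127 (placeholder)
    real_md=$2    # the array the partition belongs to, e.g. /dev/md0 (placeholder)
    member=$3     # the partition that dropped out, e.g. /dev/sdb1 (placeholder)

    # Stop the bogus array so it releases the partition.
    ${MDADM:-mdadm} --stop "$bogus_md" || return 1
    # Hot-add the partition back into the real array.
    ${MDADM:-mdadm} "$real_md" -a "$member" || return 1
    # Show the array state, including any rebuild in progress.
    ${MDADM:-mdadm} --detail "$real_md"
}
```

Called as "readd_member /dev/md127 /dev/md0 /dev/sdb1" with your own names filled in; afterwards "cat /proc/mdstat" shows the resync progress.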
If that doesn't work, I would try assembling with the --force option, which might be a little more dangerous than the hot add, but probably not much. When I have a drive fall out of an array, I am always able to add it back with the first command (-a). As I mentioned, I do have bitmaps on all my arrays, but you can't change that until you rebuild the raidset.
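For the fallback, a forced assemble might be wrapped up as in this sketch. It assumes the target array is not currently running and that you list every member partition; the function name and the device names in the usage line are placeholders.

```shell
# Sketch only: force_assemble and the example devices are hypothetical.
# Assumes the array named in $1 is currently stopped.
force_assemble() {
    md=$1     # target array, e.g. /dev/md0 (placeholder)
    shift     # the remaining arguments are the member partitions
    # --force lets mdadm assemble even if some members look out of date.
    ${MDADM:-mdadm} --assemble --force "$md" "$@"
}
```

For example, "force_assemble /dev/md0 /dev/sda1 /dev/sdb1". Do this only after the backup, since --force tells mdadm to accept members it would otherwise reject as stale.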
I believe these commands will take care of everything. You shouldn't have to do any low-level diddling of the superblocks, and if the problem is that bad, you might be best off backing up and recreating the whole array, or engaging the services of someone who knows how to muck with the data structures on the disk. I've never had to use anything other than mdadm to manage my RAID arrays, and I've never lost data with Linux software RAID in the 10 or more years that I've been using it. I've found it to be quite robust. Backing up is just a precaution that is a good idea for anyone who cares about their data.
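If you are curious how the superblocks differ, you can look at them read-only with mdadm --examine, which changes nothing on disk. The sketch below is one way to line up the interesting fields per member; the function name, the device names, and the choice of fields to filter on are my assumptions, and the exact field labels vary with the superblock version.

```shell
# Sketch only: show_superblocks and the example devices are hypothetical.
# Read-only: --examine just prints each member's md superblock.
show_superblocks() {
    for dev in "$@"; do
        echo "== $dev =="
        # Keep only the lines useful for comparing members.
        ${MDADM:-mdadm} --examine "$dev" | grep -E 'UUID|Events|Update Time'
    done
}
```

For example, "show_superblocks /dev/sda1 /dev/sdb1". Members of the same array should report the same array UUID, and a member that fell out will typically show a lower event count than the others.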
If these problems recur on a regular basis, you could have a bad drive, a power supply problem, or a cabling problem. Assuming your drives are attached to a SATA, SCSI, or SAS controller, you can use smartctl to check the drives and see if they are getting errors or other faults. smartctl often does not work with USB- or FireWire-attached drives.
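A quick smartctl pass over one drive might look like the sketch below. The function name and the device name in the usage line are placeholders; point it at the whole disk rather than a partition, and run it as root.

```shell
# Sketch only: check_drive and the example device are hypothetical.
# Override SMARTCTL (e.g. SMARTCTL=echo) to preview instead of execute.
check_drive() {
    disk=$1   # whole-disk device, e.g. /dev/sda (placeholder)
    ${SMARTCTL:-smartctl} -H "$disk"         # overall health self-assessment
    ${SMARTCTL:-smartctl} -l error "$disk"   # errors the drive itself has logged
    ${SMARTCTL:-smartctl} -A "$disk"         # vendor attributes, e.g. reallocated sectors
}
```

For example, "check_drive /dev/sda". In the attribute table, growing reallocated or pending sector counts tend to point at the drive itself, while a climbing UDMA CRC error count more often suggests a cable or connector.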
Nataraj