On Thu, Oct 21, 2010 at 11:03:27AM -0700, Nataraj wrote:
fred smith wrote:
Thanks for the additional information.
I'll try backing up everything this weekend then will take a stab at it.
Someone said earlier that the differing RAID superblocks were probably the cause of the misassignment in the first place, but I have no clue how the superblocks could have become messed up. Can any of you comment on that? Will I need to hack at that issue, too, before I can succeed?
thanks again!
Nataraj
I would first try adding the drives back in with:
mdadm /dev/mdN -a /dev/sdXn
Again, this is after having stopped the bogus md arrays.
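For what it's worth, here is a rough sketch of that sequence as a small script. The device names (/dev/md127 for the bogus array, /dev/md0 and /dev/sdb1 for the real array and the dropped partition) are only placeholders I made up, so substitute your own. The run helper and the DRY_RUN switch are also just my own scaffolding: by default the script only prints the commands, so nothing is touched until you set DRY_RUN=0 and run it as root.

```shell
#!/bin/sh
# Sketch only. All device names below are hypothetical placeholders.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop a wrongly assembled array, then hot-add its member to the right one.
readd_member() {
    bogus="$1"
    array="$2"
    member="$3"
    run mdadm --stop "$bogus"         # release the member from the bogus array
    run mdadm "$array" -a "$member"   # hot-add it back to the real array
    run cat /proc/mdstat              # shows the resync progress
}

readd_member /dev/md127 /dev/md0 /dev/sdb1
```

Let the resync shown in /proc/mdstat finish before you reboot.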
Nataraj, that worked fine, didn't need to --force it. Now I'm back to having two devices in md0 and two in md1, and they're the RIGHT two! :) Put the box in single-user mode to do the work, then after the array finished resyncing, rebooted and it came up with the right two md devices.
I appreciate your tutoring me on this, you've been most helpful.
Thanks a bunch!
Oh, can you refer me to any good documentation on how to admin a software RAID system? Something aimed at people like me, who are computer literate but have never trained as a sysadmin and don't know much about RAID...
thanks again!
Fred
If that doesn't work, I would try assemble with a --force option, which might be a little more dangerous than the hot add, but probably not much. I can say that when I have a drive fall out of an array I am always able to add it back with the first command (-a). As I mentioned, I do have bitmaps on all my arrays, but you can't change that until you rebuild the raidset.
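In case it helps, a forced assemble might look roughly like the sketch below. Again, /dev/md0, /dev/sda1 and /dev/sdb1 are placeholder names I picked for illustration, and the run/DRY_RUN wrapper is just my own safety net so the script prints the commands by default rather than running them. Looking at the --examine output first lets you compare the "Events" counters on each member before forcing anything.

```shell
#!/bin/sh
# Sketch only. Device names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Inspect each member's superblock, then force-assemble the array.
force_assemble() {
    array="$1"
    shift
    for member in "$@"; do
        run mdadm --examine "$member"   # compare the Events counters by eye
    done
    run mdadm --stop "$array"           # the array must not be running
    run mdadm --assemble --force "$array" "$@"
    run mdadm --detail "$array"         # confirm which members are active
}

force_assemble /dev/md0 /dev/sda1 /dev/sdb1
```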
I believe these commands will take care of everything. You shouldn't have to do any diddling of the superblocks at a low level, and if the problem is that bad, you might be best to back up and recreate the whole array or engage the services of someone who knows how to muck with the data structures on the disk. I've never had to use anything other than mdadm to manage my RAID arrays and I've never lost data with Linux software RAID in the 10 or more years that I've been using it. I've found it to be quite robust. Backing up is just a precaution that is a good idea for anyone to take if they care about their data.
If these problems recur on a regular basis, you could have a bad drive, a power supply problem or a cabling problem. Assuming your drives are attached to a SATA, SCSI or SAS controller, you can use smartctl to check the drives and see if they are getting errors or other faults. smartctl will not work with USB or FireWire attached drives.
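As a starting point, something like the following sketch would run the basic smartctl checks over each drive. The drive names /dev/sda and /dev/sdb are assumptions on my part, and the run/DRY_RUN wrapper is my own addition so that the script only prints the commands unless you set DRY_RUN=0 and run it as root.

```shell
#!/bin/sh
# Sketch only. Drive names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Basic SMART checks for one drive.
check_drive() {
    drive="$1"
    run smartctl -H "$drive"         # overall health self-assessment
    run smartctl -l error "$drive"   # errors the drive itself has logged
    run smartctl -A "$drive"         # attributes, e.g. reallocated sectors
}

for d in /dev/sda /dev/sdb; do
    check_drive "$d"
done
```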
Nataraj
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos