Hello Everyone,
First time CentOS poster :) I have CentOS 4 installed on the head node of a Sun Gridengine cluster set up in a RAID. The head node has four hard drives, and I assume that drives 1 and 2 are in a raid and then drives 3 and 4 are in another raid. I was trying to expand the OS partition on drive 1 because it was full. I took drive 1 out, put it in my Fedora 8 box as a secondary drive, booted up into Fedora, and saw it had the partition structure: / 8GB /var 4GB /swap 1GB and an "unknown" partition 101.4GB
I did a cp -rfa on the / and /var files for a backup (I know, not the best way). Restarted my Fedora box into Windows to take a look at the drive using Paragon Partition Manager. Restarted into Fedora and, using gparted, formatted the "unknown" partition as ext3 (I think that is where I made my fatal mistake), moved the /swap to the middle of the drive, moved the /var to the middle and expanded it to 10GB, and then expanded / to about 50GB to fill up the rest.
I also took drive 2 out of the head node, put it in my Fedora box, and saw it had the partition structure: /swap 15GB and an "unknown" partition 101.4GB
Ok, now when I put everything back into the head node, and reboot, the BIOS sees all four drives, and from what I can tell, recognizes the first raid (of drives 3 and 4), but says it can only find one disk for the second raid (drives 1 and 2). I can't find any way around this.
Looking at my /etc/raidtab file:

raiddev /dev/md0
    raid-level 1
    nr-raid-disks 2
    nr-spare-disks 0
    persistent-superblock 1
    device /dev/sdc1
    raid-disk 0
    device /dev/sdd1
    raid-disk 1

raiddev /dev/md1
    raid-level 0
    nr-raid-disks 2
    persistent-superblock 1
    chunk-size 4
    device /dev/sda4
    raid-disk 0
    device /dev/sdb2
    raid-disk 1
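The raidtab above already says which array can tolerate a missing member. A minimal Python sketch that reads the two stanzas and reports each array's level and member partitions; the parser and the helper names are illustrative assumptions, not part of raidtools, and it only handles the keys that appear in this file.

```python
# Sketch: summarize the arrays declared in a raidtab-style file.
# RAIDTAB holds the two stanzas quoted in this post.
RAIDTAB = """
raiddev /dev/md0
    raid-level 1
    nr-raid-disks 2
    nr-spare-disks 0
    persistent-superblock 1
    device /dev/sdc1
    raid-disk 0
    device /dev/sdd1
    raid-disk 1
raiddev /dev/md1
    raid-level 0
    nr-raid-disks 2
    persistent-superblock 1
    chunk-size 4
    device /dev/sda4
    raid-disk 0
    device /dev/sdb2
    raid-disk 1
"""


def parse_raidtab(text):
    """Return {array: {"level": str, "devices": [member partitions]}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines, comments, and malformed lines
        key, value = fields[0], fields[1]
        if key == "raiddev":
            current = arrays.setdefault(value, {"level": None, "devices": []})
        elif current is None:
            continue  # ignore settings that appear before any raiddev line
        elif key == "raid-level":
            current["level"] = value
        elif key == "device":
            current["devices"].append(value)
    return arrays


def survives_one_lost_member(level):
    """Redundant levels tolerate one lost member; RAID 0 and linear do not."""
    return level in {"1", "4", "5", "6", "10"}


ARRAYS = parse_raidtab(RAIDTAB)
for name, info in ARRAYS.items():
    print(name, "RAID", info["level"], info["devices"],
          "survives one lost member:", survives_one_lost_member(info["level"]))
```

Read this way, md0 is a RAID 1 mirror of /dev/sdc1 and /dev/sdd1, while md1 is a RAID 0 stripe across /dev/sda4 and /dev/sdb2, so drives 1 and 2 were never mirrors of each other.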
It says it can bring up md0 ok, but not md1. Right now, I am going to try to restore the "unknown" partition that I deleted from drive 1 using the "unknown" partition from drive 2.
Any ideas on how to get myself out of this mess? I feel like I really messed it up good. This is a server for our work, and we have a couple years worth of data on it, so I would really like to fix it rather than reinstall.
Thank you greatly for any help! Jeff Sadino
Jeff Sadino wrote:
Hello Everyone,
First time CentOS poster :) I have CentOS 4 installed on the head node of a Sun Gridengine cluster set up in a RAID. The head node has four hard drives, and I assume that drives 1 and 2 are in a raid and then drives 3 and 4 are in another raid. I was trying to expand the OS partition on drive 1 because it was full. I took drive 1 out, put it in my Fedora 8 box as a secondary drive, booted up into Fedora, and saw it had the partition structure: / 8GB /var 4GB /swap 1GB and an "unknown" partition 101.4GB
...
probably LVM.
wow, you made a nice mess.
Do you think I can copy the unknown partition from the second drive onto the first drive and have everything work again?
On Wed, Mar 3, 2010 at 4:01 PM, John R Pierce pierce@hogranch.com wrote:
...
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Jeff Sadino wrote:
Do you think I can copy the unknown partition from the second drive onto the first drive and have everything work again?
If there's an md mirror, god knows what will happen when you boot it up with both drives present. It could decide to mirror the unformatted partition to the original; I've seen worse happen when the juju is messed with.
I think I'd totally wipe the drive you messed with, like zero it. Then boot up the system in single user with just the one drive you didn't hose, and BACK UP EVERYTHING ON IT TO EXTERNAL SAFE MEDIA. Use dump or something similar that does a proper inode-level backup of each volume. Then add the wiped/blank drive, remirror it with the mdadm commands to repair the raid, reboot to normal mode, and see if it's all safe.
On Thursday, March 04, 2010 10:09 AM, John R Pierce wrote:
Jeff Sadino wrote:
Do you think I can copy the unknown partition from the second drive onto the first drive and have everything work again?
If there's an md mirror, god knows what will happen when you boot it up with both drives present. It could decide to mirror the unformatted partition to the original; I've seen worse happen when the juju is messed with.
Well, he messed up a stripe. Zero chance of recovery.
I think I'd totally wipe the drive you messed with, like zero it. Then boot up the system in single user with just the one drive you didn't hose, and BACK UP EVERYTHING ON IT TO EXTERNAL SAFE MEDIA. Use dump or something similar that does a proper inode-level backup of each volume. Then add the wiped/blank drive, remirror it with the mdadm commands to repair the raid, reboot to normal mode, and see if it's all safe.
Won't apply. He can kiss md1 good bye.
Thanks for the insight. Is there any way to bring it back to life not necessarily as a raid, but just back up so I can get to the data and have my license managers working? What if I edit md1 out of the raidtab file?
Thanks, Jeff
On Wed, Mar 3, 2010 at 4:12 PM, Christopher Chan < christopher.chan@bradbury.edu.hk> wrote:
...
Jeff Sadino wrote:
Thanks for the insight. Is there any way to bring it back to life not necessarily as a raid, but just back up so I can get to the data and have my license managers working? What if I edit md1 out of the raidtab file?
The data on a raid0 stripes across both drives as though it were one big cylinder. You aren't going to make md1 work with one member missing. You should be able to recover what was on md0 (a raid1) from either of its members.
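To make the striping concrete, here is a small Python sketch of how a two-member RAID 0 lays out data. It assumes equal-sized members and the simple round-robin layout, and it reads chunk-size 4 from the raidtab as 4 KiB; the function names are illustrative, not taken from any md tool.

```python
# Sketch: where does a given byte of a RAID 0 array physically live?
# Assumes equal-sized members and plain round-robin chunk placement.
CHUNK_SIZE = 4 * 1024                  # chunk-size 4 in raidtab, taken as KiB
MEMBERS = ["/dev/sda4", "/dev/sdb2"]   # raid-disk 0 and raid-disk 1 of md1


def locate(logical_offset, chunk_size=CHUNK_SIZE, members=MEMBERS):
    """Map a byte offset in the array to (member, byte offset on that member)."""
    chunk_index, within = divmod(logical_offset, chunk_size)
    stripe, member_index = divmod(chunk_index, len(members))
    return members[member_index], stripe * chunk_size + within


def members_touched(start, length, chunk_size=CHUNK_SIZE, members=MEMBERS):
    """Members holding any part of array bytes [start, start + length)."""
    first_chunk = start // chunk_size
    last_chunk = (start + length - 1) // chunk_size
    # After len(members) consecutive chunks every member has been visited.
    last_chunk = min(last_chunk, first_chunk + len(members) - 1)
    return sorted({members[c % len(members)]
                   for c in range(first_chunk, last_chunk + 1)})


print(locate(0))                      # first chunk sits on the first member
print(locate(4096))                   # next chunk sits on the second member
print(members_touched(0, 1024 ** 2))  # a 1 MiB extent needs both members
```

With 4 KiB chunks, anything larger than a single chunk has pieces on both partitions, which is why one surviving member of a stripe holds no complete files of any useful size.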
Ok, I'm learning a lot about raids and what to do, and what not to do. Looking at some info I had before, md1 was 200GB in size, which makes sense, but it was only 39GB full. The way I repartitioned drive 1, I probably overwrote only about 11GB. Does that make it any easier to recover any amount of the raid? Is there some sort of "recover lost partitions" option in Linux or gparted?
Thank you! Jeff
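A rough back-of-the-envelope sketch of what "only about 11GB overwritten" means on a stripe. It assumes, purely for illustration, that the overwritten area is one contiguous 11 GiB run on /dev/sda4, that both members are the same size, and that the chunk size is 4 KiB; it ignores anything the ext3 format wrote elsewhere on the partition.

```python
# Sketch: how far does damage on one RAID 0 member spread across the array?
# All inputs are assumptions taken from the figures mentioned in this thread.
GIB = 1024 ** 3
CHUNK_SIZE = 4 * 1024   # chunk-size 4 (KiB) from raidtab
N_MEMBERS = 2           # /dev/sda4 and /dev/sdb2


def damage_estimate(overwritten_bytes, chunk_size=CHUNK_SIZE, n_members=N_MEMBERS):
    """Estimate the footprint of a contiguous overwrite on one member."""
    return {
        # Number of chunks on the damaged member that no longer hold array data.
        "chunks_lost": overwritten_bytes // chunk_size,
        # Array address range those chunks are interleaved through.
        "logical_span_gib": overwritten_bytes * n_members / GIB,
        # Longest intact run between two lost chunks inside that range.
        "intact_run_bytes": chunk_size * (n_members - 1),
    }


ESTIMATE = damage_estimate(11 * GIB)
print(ESTIMATE)
```

Under those assumptions the 11 GiB is not a single hole: it is spread as alternating 4 KiB gaps through roughly 22 GiB of the array, so within that range no file larger than one chunk survives intact.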
On Wed, Mar 3, 2010 at 4:55 PM, Les Mikesell lesmikesell@gmail.com wrote:
...
Jeff Sadino wrote:
Ok, I'm learning a lot about raids and what to do, and what not to do. Looking at some info I had before, md1 was 200GB in size, which makes sense, but it was only 39GB full. The way I repartitioned drive 1, I probably overwrote only about 11GB. Does that make it any easier to recover any amount of the raid? Is there some sort of "recover lost partitions" option in Linux or gparted?
The partition is just a map. If you can re-create the partition exactly the way it was before, the data should still be there if it wasn't overwritten.
But as far as I know there isn't a backup stored of the partition table. If the disk is exactly the same as the other member, then you can try duplicating the partition setup using the intact disk as a guide.
I don't know whether or not it will help restore a RAID 0 set, but may be worth a shot since the situation probably can't get much worse.
nate
On Thursday, March 04, 2010 11:33 AM, nate wrote:
Jeff Sadino wrote:
Ok, I'm learning a lot about raids and what to do, and what not to do. Looking at some info I had before, md1 was 200GB in size, which makes sense, but it was only 39GB full. The way I repartitioned drive 1, I probably overwrote only about 11GB. Does that make it any easier to recover any amount of the raid? Is there some sort of "recover lost partitions" option in Linux or gparted?
The partition is just a map. If you can re-create the partition exactly the way it was before, the data should still be there if it wasn't overwritten.
The problem was that he formatted it as ext3...
But as far as I know there isn't a backup stored of the partition table. If the disk is exactly the same as the other member, then you can try duplicating the partition setup using the intact disk as a guide.
+1
I don't know whether or not it will help restore a RAID 0 set, but may be worth a shot since the situation probably can't get much worse.
Just hope that the ext3 format only hit blocks containing non-essential data, and nothing related to filesystem structure, and yada, yada.
On Mar 3, 2010, at 10:24 PM, Jeff Sadino jsadino.queens@gmail.com wrote:
Ok, I'm learning a lot about raids and what to do, and what not to do. Looking at some info I had before, md1 was 200GB in size, which makes sense, but it was only 39GB full. The way I repartitioned drive 1, I probably overwrote only about 11GB. Does that make it any easier to recover any amount of the raid? Is there some sort of "recover lost partitions" option in Linux or gparted?
Don't you have backups of this data?
You can just re-create the raid0 and restore the data.
-Ross
Backups? I wish :) I will now. But looking closer, that md1 is not my OS partition, just a data partition. If I take that md1 entry out of the raidtab file and restart the computer, I would think that it would start up just fine, minus the data partition (and for the moment neglecting any vital programs that might be installed on that partition). My question is when I start the computer back up, in order to start up without that partition there any more, will the OS write any new files or anything that will not be reversible?
Thank you again, Jeff
On Wed, Mar 3, 2010 at 6:40 PM, Ross Walker rswwalker@gmail.com wrote:
...
On Thursday, March 04, 2010 01:15 PM, Jeff Sadino wrote:
Backups? I wish :) I will now.
/me hands Jeff a big clueby4 to use on the former admin.
But looking closer, that md1 is not my OS partition, just a data partition. If I take that md1 entry out of the raidtab file and restart the computer, I would think that it would start up just fine, minus the data partition (and for the moment neglecting any vital programs that might be installed on that partition). My question is when I start the computer back up, in order to start up without that partition there any more, will the OS write any new files or anything that will not be reversible?
Yes, logs mainly, and that lot is most probably not reversible. You may need to comment out the entry in /etc/fstab for md1 too. You might see messages from services tied to the data partition, and if they get in the way of startup, just go into single-user mode and disable them...
Thank you again, Jeff
On Wed, Mar 3, 2010 at 6:40 PM, Ross Walker <rswwalker@gmail.com> wrote:
...
Jeff Sadino wrote:
Backups? I wish :) I will now. But looking closer, that md1 is not my OS partition, just a data partition. If I take that md1 entry out of the raidtab file and restart the computer, I would think that it would start up just fine, minus the data partition (and for the moment neglecting any vital programs that might be installed on that partition). My question is when I start the computer back up, in order to start up without that partition there any more, will the OS write any new files or anything that will not be reversible?
You'll need to take the mount point out of /etc/fstab to come up without it. I don't think it even matters that the raid assembly fails, but not being able to mount everything in fstab is fatal.
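A minimal sketch of that fstab change in Python, working on a sample string rather than the real file. The sample entries, including the /data mount point, are hypothetical, since the actual fstab was never posted; entries that refer to the filesystem by LABEL= instead of /dev/md1 would need a different match.

```python
# Sketch: comment out every fstab entry whose device field is /dev/md1.
# SAMPLE_FSTAB is made up for illustration; the real file was not posted.
SAMPLE_FSTAB = """\
/dev/sda1   /       ext3   defaults   1 1
/dev/sda2   /var    ext3   defaults   1 2
/dev/md1    /data   ext3   defaults   1 2
/dev/sda3   swap    swap   defaults   0 0
"""


def comment_out_device(fstab_text, device):
    """Return fstab_text with entries for `device` prefixed by '#'."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # The first field of an fstab entry names the device to mount.
        if fields and fields[0] == device:
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


EDITED = comment_out_device(SAMPLE_FSTAB, "/dev/md1")
print(EDITED, end="")
```

Commenting the line out rather than deleting it keeps the original entry on hand in case the array is ever rebuilt.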
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Jeff Sadino Sent: Thursday, March 04, 2010 6:15 AM To: CentOS mailing list Subject: Re: [CentOS] Recover RAID
Backups? I wish :) I will now. But looking closer, that md1 is not my OS partition, just a data partition. If I take that md1 entry out of the raidtab file and restart the computer, I would think that it would start up just fine, minus the data partition (and for the moment neglecting any vital programs that might be installed on that partition). My question is when I start the computer back up, in order to start up without that partition there any more, will the OS write any new files or anything that will not be reversible? ----
Eh? Raid0 with no backups? For real?
raiddev /dev/md1
    raid-level 0
    nr-raid-disks 2
    persistent-superblock 1
    chunk-size 4
    device /dev/sda4
    raid-disk 0
    device /dev/sdb2
    raid-disk 1
It says it can bring up md0 ok, but not md1. Right now, I am going to try to restore the "unknown" partition that I deleted from drive 1 using the "unknown" partition from drive 2.
Any ideas on how to get myself out of this mess? I feel like I really messed it up good. This is a server for our work, and we have a couple years worth of data on it, so I would really like to fix it rather than reinstall.
Sorry, recovery is impossible at this point. md1 is a raid0 device. Since you have toasted one member partition, the whole thing is toast. I recommend clubbing whoever set it up in the first place with a clueby4 and then yourself.