CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
Is there a way to salvage this with a live CD without having to reload the OS? This server is a very important mail server running OpenLDAP and MySQL. I figured maybe I could install the MBR using a live CD, which might fix the system.
If an OS reload is the ONLY option, is there a way to reload it without touching the /var or /opt filesystems? (Yes, they were created as separate partitions.) However, I am not sure whether OpenLDAP or MySQL installs anything to /usr, in which case I would be completely screwed...
Please help
Essentially, I guess the short question is: how do I make a non-bootable drive bootable if the original MBR is no longer available?
On 3/2/2012 9:03 AM, Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
Is there a way to salvage this with a live CD without having to reload the OS? This server is a very important mail server running OpenLDAP and MySQL. I figured maybe I could install the MBR using a live CD, which might fix the system.
If an OS reload is the ONLY option, is there a way to reload it without touching the /var or /opt filesystems? (Yes, they were created as separate partitions.) However, I am not sure whether OpenLDAP or MySQL installs anything to /usr, in which case I would be completely screwed...
Please help
From: Jonathan Vomacka juvix88@gmail.com
Essentially, I guess the short question is: how do I make a non-bootable drive bootable if the original MBR is no longer available?
First, ask the guys at your datacenter what the benefit of RAID is if you have to reinstall from scratch after a drive failure... Second, that's really bad luck that the one drive that failed out of the four was the one booting... Now, I have never had to do any recovery yet, but while you are waiting for answers from the experts, I think it would look something like this:
Boot in rescue mode. Make a backup if you do not have one already.
chroot /mnt/sysimage/
# grub
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
reboot
Good luck! It is Friday!
JD
On Fri, Mar 2, 2012 at 8:27 AM, Jonathan Vomacka juvix88@gmail.com wrote:
Essentially, I guess the short question is: how do I make a non-bootable drive bootable if the original MBR is no longer available?
If you haven't gotten this to work yet, it is probably as simple as booting the CentOS install disk in rescue mode and installing grub on the disk that is now the boot disk after the primary has failed. That is, you likely do have a RAID 1 /boot but either no grub on it or a misconfigured copy - or the BIOS isn't automatically failing over to it. But you'll also need to replace the failed disk, partition it to match, resync the RAID arrays, and reinstall grub on the new disk. You might want to practice that on a non-critical machine or a VM before trying it where you might lose data.
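A rough sketch of those replacement steps (device and array names here are assumptions - check /proc/mdstat and adjust sdX/mdX to your actual layout):

# /dev/sda is assumed to be the new (replacement) disk, /dev/sdb a surviving member
sfdisk -d /dev/sdb | sfdisk /dev/sda    # copy the partition table to the new disk
mdadm /dev/md0 --add /dev/sda1          # re-add the new partitions to their arrays
mdadm /dev/md1 --add /dev/sda2
cat /proc/mdstat                        # watch the resync
grub --batch <<'EOF'                    # then put the boot code on the new disk
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
quit
EOF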
Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
<snip> Whatever the outcome, I would strongly recommend escalating this to a manager, since the staff installed it incorrectly - perhaps they don't understand what a RAID is? But they need to make this right - you're paying for it.
Besides, it's not Windows....
mark
On 3/2/2012 9:34 AM, m.roth@5-cent.us wrote:
Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
<snip> Whatever the outcome, I would strongly recommend escalating this to a manager, since the staff installed it incorrectly - perhaps they don't understand what a RAID is? But they need to make this right - you're paying for it.
Besides, it's not Windows....
mark
Mark Roth,
I escalated to the DC manager and this is what he replied:
"I'm sorry your having a hard time with software raid on your server and our install process. From what I remember talking with out techs long ago about this is, that when using raid10 and software raid, the bootloader cannot be installed on the software raid partition and has to be on a single drive. I am not 100% sure on this, and will confirm with my tech later tonight and to see what can be done to correct your issue."
On Saturday 03 March 2012 00:35, the following was written:
I escalated to the DC manager and this is what he replied:
"I'm sorry your having a hard time with software raid on your server and our install process. From what I remember talking with out techs long ago about this is, that when using raid10 and software raid, the bootloader cannot be installed on the software raid partition and has to be on a single drive. I am not 100% sure on this, and will confirm with my tech later tonight and to see what can be done to correct your issue."
Do not let them tell you that you cannot boot from software RAID. I do it here all the time. /boot has to be on a RAID 1 set to boot; everything else can be on whatever RAID level you choose.
Bottom line is if they caused you downtime then you should be compensated for it.
On Sat, Mar 3, 2012 at 12:34 AM, Robert Spangler mlists@zoominternet.net wrote:
Do not let them tell you that you cannot boot from software RAID. I do it here all the time. /boot has to be on a RAID 1 set to boot; everything else can be on whatever RAID level you choose.
You don't actually boot from a RAID 1; you boot from one of the mirrored partitions, which happens to look enough like a normal non-RAID partition to work. And it is up to the BIOS on the machine to try the 2nd copy if the 1st drive fails, and grub has to be installed on the 2nd drive and configured to identify the drive the same way the BIOS will after the failure (which I don't think is always the same and may even depend on the type of failure).
On 03/02/2012 09:03 AM, Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
Is there a way to salvage this with a live CD without having to reload the OS? This server is a very important mail server running OpenLDAP and MySQL. I figured maybe I could install the MBR using a live CD, which might fix the system.
If an OS reload is the ONLY option, is there a way to reload it without touching the /var or /opt filesystems? (Yes, they were created as separate partitions.) However, I am not sure whether OpenLDAP or MySQL installs anything to /usr, in which case I would be completely screwed...
Please help
Advice provided as-is.
Boot from a live CD using the CentOS 6.2 install media. Once booted:
<bash># grub
<grub> root (hd0,0)
<grub> setup (hd0)
<grub> root (hd1,0)
<grub> setup (hd1)
<grub> root (hd2,0)
<grub> setup (hd2)
<grub> quit
<bash># reboot
This assumes that grub sees the drives at '0, 1 and 2' and the boot partition is the first on each drive. If it is, when you type 'root (hdX,0)' it should report that a file system was found. The 'setup (hdX)' will tell grub to write the MBR to the specified disk.
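If you are not sure which disks already carry the boot code, here is a quick and admittedly crude check from a running system or rescue shell (device names are assumptions; GRUB legacy's stage1 happens to contain the string "GRUB"):

for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
  echo -n "$d: "
  dd if=$d bs=512 count=1 2>/dev/null | strings | grep -q GRUB && echo "GRUB stage1 found in MBR" || echo "no GRUB signature"
done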
Digimer wrote: <snip>
Boot from a live CD using the CentOS 6.2 install media. Once booted:
<bash># grub
<grub> root (hd0,0)
<grub> setup (hd0)
<grub> root (hd1,0)
<grub> setup (hd1)
<grub> root (hd2,0)
<grub> setup (hd2)
<grub> quit
<bash># reboot
This assumes that grub sees the drives at '0, 1 and 2' and the boot partition is the first on each drive. If it is, when you type 'root (hdX,0)' it should report that a file system was found. The 'setup (hdX)' will tell grub to write the MBR to the specified disk.
THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
mark
On 03/02/2012 01:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.
THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
mark
I've run into this a few times now where RAID'ed /boot doesn't have the MBR written to all members. I've now gotten into the habit of running this after OS install. I should file a bug with upstream...
On 03/02/2012 02:39 PM, Digimer wrote:
On 03/02/2012 01:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.
THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
mark
I've run into this a few times now where RAID'ed /boot doesn't have the MBR written to all members. I've now gotten into the habit of running this after OS install. I should file a bug with upstream...
https://bugzilla.redhat.com/show_bug.cgi?id=799501
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.

THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
When I set up a RAID 1, I do it like this:
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)
This way, all the drives are set up as if they are hd0. This way, any of them will boot normally as a stand-alone drive.
On 3/2/2012 2:46 PM, Bowie Bailey wrote:
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.

THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
When I set up a RAID 1, I do it like this:
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)
This way, all the drives are set up as if they are hd0. This way, any of them will boot normally as a stand-alone drive.
Bowie, in terms of RAID 10, each drive technically can't be standalone, right? The drives are striped and mirrored.
On 03/02/2012 04:00 PM, Jonathan Vomacka wrote:
On 3/2/2012 2:46 PM, Bowie Bailey wrote:
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.

THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
When I set up a RAID 1, I do it like this:
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)
This way, all the drives are set up as if they are hd0. This way, any of them will boot normally as a stand-alone drive.
Bowie, in terms of RAID 10, each drive technically can't be standalone, right? The drives are striped and mirrored.
10 (one-zero) == a mirror of two striped arrays. You can lose up to two drives, so long as they are both from the same stripe set.
As I understand it though, I thought that /boot could only exist on RAID 1 vanilla.
On Friday, March 02, 2012 04:18:48 PM Digimer wrote:
10 (one-zero) == a mirror of two striped arrays. You can lose up to two drives, so long as they are both from the same stripe set.
You really don't want a mirror of two striped arrays; you want a striped array of mirrors.
A striped array of mirrors is much more resilient to multiple drive failures than a mirrored set of striped arrays.
See http://www.linux-mag.com/id/7928/?hq_e=el&hq_m=1151565&hq_l=36&h...
What you're describing is RAID01, not RAID10.
On 2.3.2012 22:18, Digimer wrote:
On 03/02/2012 04:00 PM, Jonathan Vomacka wrote:
On 3/2/2012 2:46 PM, Bowie Bailey wrote:
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
Bowie, in terms of RAID 10, each drive technically can't be standalone, right? The drives are striped and mirrored.
10 (one-zero) == a mirror of two striped arrays. You can lose up to two drives, so long as they are both from the same stripe set.
10 is a stripe of two mirrors, but anyway, there is more to it because mdraid 10 is another beast. Unfortunate naming!
http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
So one has to ask every time what is meant: a standard layered RAID 10 or an mdraid 10.
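For what it's worth, a sketch of creating an md 'raid10' array directly, with the layout spelled out (device names and the n2 layout are assumptions; f2 and o2 are the other variants):

mdadm --create /dev/md1 --level=10 --layout=n2 --raid-devices=4 /dev/sd[abcd]2
# unlike a layered RAID 1+0, md raid10 even accepts an odd number of members, e.g.:
# mdadm --create /dev/md1 --level=10 --layout=n2 --raid-devices=3 /dev/sd[abc]2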
As I understand it though, I thought that /boot could only exist on RAID 1 vanilla.
I think this is correct. Because of this, I also think that the OP has bigger problems than just the MBR. Where has his /boot gone?
On 3/2/2012 4:00 PM, Jonathan Vomacka wrote:
On 3/2/2012 2:46 PM, Bowie Bailey wrote:
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.

THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
When I set up a RAID 1, I do it like this:
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)
This way, all the drives are set up as if they are hd0. This way, any of them will boot normally as a stand-alone drive.
Bowie, in terms of RAID 10, each drive technically can't be standalone, right? The drives are striped and mirrored.
Right. I was referring to RAID 1. For a RAID 10, you would have to find the proper drive to boot from. This is why I tend to limit myself to RAID 1 in software. If I need something more complex than that, I get a hardware card so the OS just sees it as a single drive and you don't have to worry about grub.
On 03/02/2012 04:20 PM, Bowie Bailey wrote:
On 3/2/2012 4:00 PM, Jonathan Vomacka wrote:
On 3/2/2012 2:46 PM, Bowie Bailey wrote:
On 3/2/2012 1:01 PM, m.roth@5-cent.us wrote:
Digimer wrote:
<snip>
> Boot from a live CD using the CentOS 6.2 install media. Once booted:
>
> <bash># grub
> <grub> root (hd0,0)
> <grub> setup (hd0)
> <grub> root (hd1,0)
> <grub> setup (hd1)
> <grub> root (hd2,0)
> <grub> setup (hd2)
> <grub> quit
> <bash># reboot
>
> This assumes that grub sees the drives at '0, 1 and 2' and the boot
> partition is the first on each drive. If it is, when you type 'root
> (hdX,0)' it should report that a file system was found. The 'setup
> (hdX)' will tell grub to write the MBR to the specified disk.

THANK YOU! I could have used that once or twice, and had no idea that grub could create a std. MBR.
When I set up a RAID 1, I do it like this:
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)
This way, all the drives are set up as if they are hd0. This way, any of them will boot normally as a stand-alone drive.
Bowie, in terms of RAID 10, each drive technically can't be standalone, right? The drives are striped and mirrored.
Right. I was referring to RAID 1. For a RAID 10, you would have to find the proper drive to boot from. This is why I tend to limit myself to RAID 1 in software. If I need something more complex than that, I get a hardware card so the OS just sees it as a single drive and you don't have to worry about grub.
Ya. When I do use four drives, I simply create a four-member RAID level 1 array... Keeps it simple and, frankly, any performance gain from other (mixed) levels would be easily missed.
Right. I was referring to RAID 1. For a RAID 10, you would have to find the proper drive to boot from. This is why I tend to limit myself to RAID 1 in software. If I need something more complex than that, I get a hardware card so the OS just sees it as a single drive and you don't have to worry about grub.
When I want to boot off of a raid 10, I first partition the drives and make a small (like a gigabyte) partition 1, and put the rest of the space on partition 2. I do this on all drives, then I create a raid 1 of sd[abcd]1 for /, and a raid10 of sd[abcd]2 for everything else. I've got this config on several tens of servers and it seems to work okay.
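A rough sketch of that recipe with actual commands (sizes and device names are examples only, not exactly what I run):

parted -s /dev/sda mklabel msdos
parted -s /dev/sda mkpart primary 1MiB 1GiB    # small partition 1 for the RAID 1
parted -s /dev/sda mkpart primary 1GiB 100%    # partition 2 for the RAID 10
parted -s /dev/sda set 1 raid on
parted -s /dev/sda set 2 raid on
# repeat the partitioning for sdb, sdc and sdd, then:
mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=4 /dev/sd[abcd]1   # for /, superblock at the end so GRUB legacy can read it
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[abcd]2                 # for everything else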
On 3/4/2012 8:01 PM, Luke S. Crawford wrote:
Right. I was referring to RAID 1. For a RAID 10, you would have to find the proper drive to boot from. This is why I tend to limit myself to RAID 1 in software. If I need something more complex than that, I get a hardware card so the OS just sees it as a single drive and you don't have to worry about grub.
When I want to boot off of a raid 10, I first partition the drives and make a small (like a gigabyte) partition 1, and put the rest of the space on partition 2. I do this on all drives, then I create a raid 1 of sd[abcd]1 for /, and a raid10 of sd[abcd]2 for everything else. I've got this config on several tens of servers and it seems to work okay.
Thanks Luke. Lucky to find you here. I am still waiting on the monthly VPS packages you know!
On Mar 4, 2012, at 8:01 PM, "Luke S. Crawford" lsc@prgmr.com wrote:
Right. I was referring to RAID 1. For a RAID 10, you would have to find the proper drive to boot from. This is why I tend to limit myself to RAID 1 in software. If I need something more complex than that, I get a hardware card so the OS just sees it as a single drive and you don't have to worry about grub.
When I want to boot off of a raid 10, I first partition the drives and make a small (like a gigabyte) partition 1, and put the rest of the space on partition 2. I do this on all drives, then I create a raid 1 of sd[abcd]1 for /, and a raid10 of sd[abcd]2 for everything else. I've got this config on several tens of servers and it seems to work okay.
Technically, if the data portion is a true RAID 10, you would only need to mirror /boot to sdb, because if both sda AND sdb are out then the whole RAID 10 is SOL and there would be no need to boot off of sdc or sdd.
Having said that, though, it's just easier to create a 4-disk RAID 1 for /boot and duplicate the MBR across all of them.
That should be standard practice for ALL software RAID setups, as it allows the initial boot from any device in the set. The data portion of the drives can then be RAID 10, RAID 5, etc.
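A sketch of that "duplicate the MBR across all of them" step, scripted with GRUB legacy's batch mode (the device list is an assumption; adjust it to your drives):

for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
  grub --batch <<EOF
device (hd0) $d
root (hd0,0)
setup (hd0)
quit
EOF
done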
Sounds like the hosting provider isn't very Linux savvy. I would always double check the setup of any system someone else installs for you.
-Ross
On Mon, Mar 05, 2012 at 06:12:52PM -0500, Ross Walker wrote:
Technically, if the data portion is a true RAID 10, you would only need to mirror /boot to sdb, because if both sda AND sdb are out then the whole RAID 10 is SOL and there would be no need to boot off of sdc or sdd.
Having said that, though, it's just easier to create a 4-disk RAID 1 for /boot and duplicate the MBR across all of them.
I'm using the Linux 'raid10' md type. It actually allows an odd number of disks; it just guarantees that each chunk of data is on at least two drives.
So yeah, rather than trying to guess which chunk is where, I think a mirrored /boot is the easy way out.
I used to create mirror sets and stripe across them using LVM striping. (Before that I actually used LVM mirroring, but it seems that nobody uses LVM mirroring.)
Anyhow, my anecdotal experience is that the Linux md 'raid10' option results in an array that rebuilds about twice as fast as two mirrors that you stripe across with LVM.
But yeah, either way, md0 was a small mirror across all drives. (And remember to load the bootloader on all drives. It's in my kickstart so I can't forget it at setup time, but I still need to be careful when I replace drives.)
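Purely as an illustration, that kickstart fragment can look something like this (drive names are assumptions, and %post runs chrooted into the freshly installed system):

%post
# install the GRUB boot code on every member disk so any of them can boot
for d in sda sdb sdc sdd; do
  grub --batch <<EOF
device (hd0) /dev/$d
root (hd0,0)
setup (hd0)
quit
EOF
done
%end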
Sounds like the hosting provider isn't very Linux savvy. I would always double check the setup of any system someone else installs for you.
As a general rule, if you want your hosting provider to support more than just the hardware, you have to set up the software their way. I'm guessing that you asked this hosting provider, 'Hey, can you set up software RAID for me?' and they did, even though they don't usually. If they usually set up software RAID and still haven't solved this, then they are just plain incompetent; I'm just saying that if this is an unusual setup for them, it might be the 'I haven't done this before' kind of incompetence, which we all suffer from time to time.
Digimer! Thanks for the info. Since the HDD0 drive has completely failed, I would need to replace it... it doesn't have any data on it. The other three HDDs would need the MBR. I am assuming... that because RAID 10 means striped+mirrored, HDD 3+4 would be mirrored and 1+2 would be mirrored... is this an accurate statement?
On 3/2/2012 12:07 PM, Digimer wrote:
On 03/02/2012 09:03 AM, Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
I don't understand why a reinstall would be necessary when the drives are in RAID 10. Apparently, when the datacenter did the initial OS install, they ONLY installed the MBR on one drive instead of all 4, leaving the other 3 drives unbootable.
Is there a way to salvage this with a live CD without having to reload the OS? This server is a very important mail server running OpenLDAP and MySQL. I figured maybe I could install the MBR using a live CD, which might fix the system.
If an OS reload is the ONLY option, is there a way to reload it without touching the /var or /opt filesystems? (Yes, they were created as separate partitions.) However, I am not sure whether OpenLDAP or MySQL installs anything to /usr, in which case I would be completely screwed...
Please help
Advice provided as-is.
Boot from a live CD using the CentOS 6.2 install media. Once booted:
<bash># grub
<grub> root (hd0,0)
<grub> setup (hd0)
<grub> root (hd1,0)
<grub> setup (hd1)
<grub> root (hd2,0)
<grub> setup (hd2)
<grub> quit
<bash># reboot
This assumes that grub sees the drives at '0, 1 and 2' and the boot partition is the first on each drive. If it is, when you type 'root (hdX,0)' it should report that a file system was found. The 'setup (hdX)' will tell grub to write the MBR to the specified disk.
Putting an MBR on all disks right after an OS install, as previously mentioned, is of course the best option (although it's too late for that in this instance). Others have talked about using the live CD to recover from your situation, which is good.
Other, "less good" options that might be available to you are: - change the boot device in the BIOS - if the server isn't capable of doing that, have them flip the cables.
I, also, would recommend having a little chat with their manager ...
Devin
On 3/2/2012 3:09 PM, Devin Reade wrote:
Putting an MBR on all disks right after an OS install, as previously mentioned, is of course the best option (although it's too late for that in this instance). Others have talked about using the live CD to recover from your situation, which is good.
Other, "less good" options that might be available to you are:
- change the boot device in the BIOS
- if the server isn't capable of doing that, have them flip the cables.
I, also, would recommend having a little chat with their manager ...
Devin
Devin,
In terms of installing the MBR after the OS install, is this my responsibility, or the datacenter's? I would assume that I would have to copy the MBR if the install has already completed...
Also, would I need KVM access to do this, or can it be done through SSH? My DC charges a rental fee for KVM.
Jonathan Vomacka juvix88@gmail.com wrote:
On 3/2/2012 3:09 PM, Devin Reade wrote:
Putting an MBR on all disks right after an OS install, as previously mentioned, is of course the best option (although it's too late for that in this instance).
In terms of installing the MBR after the OS install, is this my responsibility, or the datacenter's?
Whose *responsibility* it is is between you and your provider and depends on your SLAs, etc. It requires root access to perform.
Also, would I need KVM access to do this, or can it be done through SSH? My DC charges a rental fee for KVM.
You can do it as root via ssh while your server is fully running; it does not need single-user mode or console access. The following is an extract from the changelog of one of my machines that has four disks, each of which has a 200MB partition 1 (for /boot, mirrored), with the rest of each disk (partition 2) put into a four-volume RAID 6 set (which is used by LVM for the remaining filesystems).
- made sure we have boot blocks on both disks, based on information at
  http://grub.enbug.org/MirroringRAID

  modified /boot/grub/device.map from:
    (hd1) /dev/sda
    (hd0) /dev/sdb
    (hd2) /dev/sdc
    (hd3) /dev/sdd
  to:
    (hd0) /dev/sda
    (hd0) /dev/sdb
    (hd0) /dev/sdc
    (hd0) /dev/sdd

  and then:
    # grub
    grub> device (hd0) /dev/sdb
    grub> root (hd0,0)
    grub> setup (hd0)
    grub> quit

  then repeated it for sdc and sdd, and finally again for sda just to be paranoid
Keep in mind that those disks already have the OS installed on them; writing the MBR like that doesn't affect the existing filesystems.
I use this kind of configuration in quite a few machines (sometimes with only two disks using mirroring on both partitions of each disk) and have verified that it works both in testing and the hard way ... having lost my nominal boot disk due to errors and recovered by powering down (these aren't hot-swap disks), replacing the bad drive, booting in degraded mode, partitioning the new disk, and then adding its partitions back into the applicable RAID sets. Ignoring the actual sync time, that's less than 20 minutes work, most of which is the disk swap. (Make sure you know which disk is the faulty one before you power down. I always ensure that there are markings on the drive that correlate to identifiers used by Linux before putting drives into production; often the manufacturer's serial number or WWID is sufficient.)
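On that last point, a couple of commands for matching Linux device names to drive serials and WWIDs before you label the drives (device names are assumptions; smartctl comes from the smartmontools package):

ls -l /dev/disk/by-id/ | grep -v part    # the ata-* and wwn-* symlinks embed serial numbers and WWIDs
smartctl -i /dev/sda | grep -i serial    # per-drive check of the manufacturer's serial number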
"I'm sorry your having a hard time with software raid on your server and our install process. From what I remember talking with out techs long ago about this is, that when using raid10 and software raid, the bootloader cannot be installed on the software raid partition and has to be on a single drive.
As Robert already observed, that's so much bullshit unless the hardware they're providing is so old (like probably > 5 years) that the BIOS can't be told to boot from alternate disks, in sequence. Even then, it's not insurmountable given they can change the boot device or revert to flipping cables even on the oldest hardware.
Give that manager the instructions above and suggest that he add it to his provisioning procedure.
Devin
On Fri, Mar 02, 2012 at 09:03:44AM -0500, Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
How were your disks partitioned? And how was the LVM set up? Was /boot made of an mdadm RAID 1 array?
There is a dracut issue with RAID 1 arrays (fixed by now, if you have updated to the latest dracut release and rebuilt your initramfs): http://bugs.centos.org/view.php?id=5400
BTW, the anaconda installer does install grub on the MBR of both RAID 1 members. No idea if it does so for LVM over a RAID 10 array.
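If you are affected by that dracut issue, the fix is roughly this (a sketch; the exact initramfs file name depends on your installed kernel):

yum update dracut
dracut --force /boot/initramfs-$(uname -r).img $(uname -r)    # rebuild the initramfs for the running kernel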
Tru
On 3/2/2012 3:38 PM, Tru Huynh wrote:
On Fri, Mar 02, 2012 at 09:03:44AM -0500, Jonathan Vomacka wrote:
CentOS Community,
I have a dedicated server with 4 hard drives in a software RAID 10 configuration running LVM. My OS is CentOS 6.2. Earlier today, I rebooted my system and it did not come back online. I opened a ticket with my datacenter, who informed me that one of my hard drives is no longer recognized by the BIOS and has failed. I was told that an OS reinstall was needed.
How were your disks partitioned? And how was the LVM set up? Was /boot made of an mdadm RAID 1 array?
There is a dracut issue with RAID 1 arrays (fixed by now, if you have updated to the latest dracut release and rebuilt your initramfs): http://bugs.centos.org/view.php?id=5400
BTW, the anaconda installer does install grub on the MBR of both RAID 1 members. No idea if it does so for LVM over a RAID 10 array.
Tru
Tru,
This was a RAID 10; would the same apply? If I do a fresh reinstall, how can I guarantee that the MBR is on all drives?
On 03/03/2012 06:48 AM, Jonathan Vomacka wrote:
This was a RAID 10; would the same apply? If I do a fresh reinstall, how can I guarantee that the MBR is on all drives?
I do not think you NEED to reinstall the entire system.
Check if there is a /boot partition on the rest of the disks. If there is, they might have a kernel (even an old one) and the rest of the files, so your system could be booted.
If there are no files on the /boot partition, you could copy it from some other CentOS 6.2 system and edit the config files so the paths are correct. This will keep your system intact with all your custom config changes.
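A rough sketch of that check (partition names are assumptions; use whatever your /boot members actually are):

mkdir -p /mnt/boot-sdb
mount -o ro /dev/sdb1 /mnt/boot-sdb    # repeat for sdc1 and sdd1
ls /mnt/boot-sdb                       # look for vmlinuz-*, initramfs-*, grub/grub.conf
# if you do copy /boot from another box, make sure grub.conf's root= and the
# kernel/initramfs file names match what is actually installed on this system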