[CentOS] Software RAID1 with CentOS-6.2
Emmett Culley
emmett at webengineer.com
Wed Feb 29 02:18:43 UTC 2012
On 02/28/2012 04:27 PM, Kahlil Hodgson wrote:
> Hello,
>
> Having a problem with software RAID that is driving me crazy.
>
> Here's the details:
>
> 1. CentOS 6.2 x86_64 install from the minimal iso (via pxeboot).
> 2. Reasonably good PC hardware (i.e. not budget, but not server grade either)
> with a pair of 1TB Western Digital SATA3 Drives.
> 3. Drives are plugged into the SATA3 ports on the mainboard (both drives and
> cables say they can do 6Gb/s).
> 4. During the install I set up software RAID1 for the two drives with two raid
> partitions:
> md0 - 500M for /boot
> md1 - "the rest" for a physical volume
> 5. Setup LVM on md1 in the standard slash, swap, home layout
>
> Install goes fine (actually really fast) and I reboot into CentOS 6.2. Next I
> ran yum update, added a few minor packages and performed some basic
> configuration.
>
> Now I start to get I/O errors printed on the console. I run 'mdadm -D
> /dev/md1' and see the array is degraded and /dev/sdb2 has been marked as
> faulty.
>
> Okay, fair enough, I've got at least one bad drive. I boot the system from a
> live USB and run the short and long SMART tests on both drives. No problems
> reported, but I know that can be misleading, so I'm going to have to gather some
> evidence before I try to return these drives. I run badblocks in destructive
> mode on both drives as follows:
>
> badblocks -w -b 4096 -c 98304 -s /dev/sda
> badblocks -w -b 4096 -c 98304 -s /dev/sdb
>
> Come back the next day and see that no errors are reported. Er, that's odd. I
> check the SMART data in case the badblocks activity has triggered something.
> Nope. Maybe I screwed up the install somehow?
>
> So I start again and repeat the install process very carefully. This time I
> check the raid array straight after boot.
>
> mdadm -D /dev/md0 - all is fine.
> mdadm -D /dev/md1 - the two drives are resyncing.
>
> Okay, that is odd. The RAID1 array was created at the start of the install
> process, before any software was installed. Surely it should be in sync
> already? I googled a bit and found a post where someone else had seen the same
> thing happen. The advice was to just wait until the drives sync so the 'blocks
> match exactly', but I'm not really happy with that explanation. At this rate
> it's going to take a whole day to do a single minimal install, and I'm sure I
> would have heard others complaining about the process.
>
> Anyway, I leave the system to sync for the rest of the day. When I get back to
> it I see the same (or similar) I/O errors on the console and mdadm shows the
> RAID array is degraded; /dev/sdb2 has been marked as faulty. This time I notice
> that the I/O errors all refer to /dev/sda. I have to reboot because the fs is
> now read-only. When the system comes back up, it's trying to resync the drive
> again. Eh?
>
> Any ideas what is going on here? If it's bad drives, I really need some
> confirmation independent of the software RAID failing. I thought SMART or
> badblocks would give me that. Perhaps it has nothing to do with the drives.
> Could a problem with the mainboard or the memory cause this issue? Is it a
> SATA3 issue? Should I try it on the 3Gb/s channels, since there's probably
> little speed difference with non-SSDs?
>
> Cheers,
>
> Kal
>
>
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos
>
>
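On the resync question above: a freshly created md RAID1 array always performs one initial sync after creation, so seeing md1 resyncing right after the install is expected and not by itself a fault. A minimal sketch of how to watch it (the status line shown in the comments is illustrative):

```shell
# A newly created md array does one initial sync; progress shows up
# in /proc/mdstat (harmless no-op on systems without the md driver).
cat /proc/mdstat 2>/dev/null || true

# While syncing, mdstat prints a line like the sample below.
# Pull the completion percentage out of such a line with sed:
line='[=>...................]  resync =  7.3% (7168000/976000000) finish=120.0min'
pct=$(printf '%s\n' "$line" | sed -n 's/.*resync = *\([0-9.]*\)%.*/\1/p')
echo "$pct"
```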
I just had a very similar problem with a RAID10 array with four new 1TB drives. It turned out to be the SATA cable.
I first tried a new drive and even replaced the five-disk hot-plug carrier. It was always the same logical drive (/dev/sdb). I then tried using an additional SATA adapter card. That cinched it, as the only thing common to all of the above was the SATA cable.
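For anyone chasing the same fault: the kernel log usually names the ATA link that is failing, and the ataN number maps to a specific controller port, and therefore a specific cable. A rough sketch (the example log line is illustrative, not from my machine):

```shell
# Look for ATA link errors/resets in the kernel log; the ataN prefix
# identifies the port (and cable) involved. Safe no-op if dmesg is
# restricted or the pattern doesn't match.
dmesg 2>/dev/null | grep -Ei 'ata[0-9]+.*(error|reset|frozen)' || true

# Example of the kind of line a flaky link produces, and how to pull
# the port name out of it:
msg='ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen'
port=$(printf '%s\n' "$msg" | sed -n 's/^\(ata[0-9]*\)\..*/\1/p')
echo "$port"
```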
All has been well for a week now.
I should have tried replacing the cable first :-)
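One way to get the independent confirmation Kal asked for: SMART attribute 199, UDMA_CRC_Error_Count, counts interface CRC errors, which implicate the cable or connector rather than the disk surface, so it can separate a bad cable from a bad drive. A sketch, assuming smartmontools is installed (the sample attribute line is illustrative):

```shell
# Attribute 199 (UDMA_CRC_Error_Count) increments on link-level CRC
# errors; a raw value that keeps climbing is the classic bad-cable
# signature. Harmless no-op if smartctl isn't present.
smartctl -A /dev/sdb 2>/dev/null | grep -i UDMA_CRC_Error_Count || true

# The raw value is the last field of the attribute line smartctl
# prints; extract it from a sample line with awk:
attr='199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 13'
raw=$(printf '%s\n' "$attr" | awk '{print $NF}')
echo "$raw"
```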
Emmett