[CentOS] 3ware disk failure -> hang -- how does software RAID "hide" a disk?

Fri Jan 6 21:26:02 UTC 2006
Mickael Maddison <centos at silverservers.com>

Hello Bryan,

Well said.

-- 
Best regards,
 Mickael
            mailto:mikelists at silverservers.com

Friday, January 6, 2006, 12:29:51 PM, you wrote:

> Joshua Baker-LePain <jlb17 at duke.edu> wrote:
>> But, as the archives of this list will attest to, using these
>> boards in hardware RAID mode in centos 4 is bad news.
>> Performance sucks.

> At RAID-5 writes?  Of course on the 7000/8000 designs.  They
> only have 1-4MB of SRAM, not enough to buffer RAID-5 writes.

> Furthermore, software RAID-0 is _always_ going to be faster
> than hardware RAID-0.  RAID-5 reads are basically RAID-0
> reads (minus one stripe).

> But at RAID-1 or RAID-10, 3Ware's 7000/8000 Storage Switch
> designs are very, very fast.

>> There's some sort of nasty interaction between the 3wares and
>> ext3 which makes the combo unusable, really.

> Huh?  _Never_ heard of that.  I'm using 7000/8000 series
> cards on RHEL3 and RHEL4 (as well as FC1-FC3), *0* issues. 
> All Ext3 filesystems.

>> Hotplug worked just fine on this system when I tested
>> (multiple times) via 'mdadm -f -r' and 'mdadm -a'.  It's the
>> actual disk failure handling that's at fault here.

> Yes, that's ... tada ... hotplug!

> You can't just have a fixed disk "remove itself" from the OS.
> That's causing your panic.

> When you're using 3Ware in JBOD, all it can do is report the
> disk failure and report the fixed disk as unusable and remove
> it from the system.  So for software RAID, it's up to the
> _kernel_ to handle that right.

> And sure enough, it doesn't.

> Has absolutely nothing to do with 3Ware's card.  When you use
> JBOD and you remove or lose a disk, which is its own volume,
> the 3Ware removes the volume -- just as a "regular" ATA or
> SCSI card would with a disk.

> There is no way for 3Ware to "hide" the volume or continue
> using it -- because there is a 1:1 disk:volume relationship.
> The only way to "hide" the disk is to use its hardware RAID
> features, where multiple disks are a volume.

> Until the kernel has standard, trusted features to handle
> failed disks, I refuse to use software RAID-1, 10 or 5.
> Hotplug in 2.6 is supposed to handle this when set up
> correctly, but I've yet to see it.
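
For anyone who wants to see what the md driver itself reports after a
member fails (or after an 'mdadm -f' test), here is a minimal Python
sketch. It is not from the thread. It assumes the usual /proc/mdstat
layout, where a failed member is tagged "(F)" after its slot number;
the function name failed_members and the SAMPLE text are made up for
illustration, and members carrying extra flags before "(F)" are not
handled.

```python
import re

def failed_members(mdstat_text):
    """Map each md array name to its members flagged (F) in mdstat text."""
    failed = {}
    for line in mdstat_text.splitlines():
        # Array lines look like: "md0 : active raid1 sdb1[1](F) sda1[0]"
        m = re.match(r"^(md\S*)\s*:\s*(.*)$", line)
        if not m:
            continue
        array, rest = m.groups()
        # A member is "<device>[<slot>]"; a failed one has "(F)" right after.
        failed[array] = re.findall(r"(\S+?)\[\d+\]\(F\)", rest)
    return failed

# Hypothetical sample, shaped like /proc/mdstat with one failed member.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      104320 blocks [2/1] [U_]

unused devices: <none>
"""

if __name__ == "__main__":
    print(failed_members(SAMPLE))
```

On a live system the text would come from reading /proc/mdstat instead
of SAMPLE. This only shows what md has already marked as failed; it
says nothing about whether the kernel survives the disk disappearing.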