[CentOS] New controller card issues

Fri May 29 02:55:41 UTC 2015
Gordon Messmer <gordon.messmer at gmail.com>

On 05/28/2015 03:02 PM, Valeri Galtsev wrote:
>
> If I get you correctly you are saying that 3ware RAID cards are prone to
> hardware failures - as opposed to software RAID which is not (as it does
> not include hardware, so never has hardware failures), right? No, this is
> a joke of course. But it's the one one asked for ;-)

Software RAID uses whatever controller the disks are connected to. 
Often, that's the AHCI controller on the motherboard.  And those 
controllers are almost always more reliable than 3ware.  Seriously.  No 
joke.

> Again, I don't know your statistics: i.e. how many did you use,

Hundreds of systems, both with 3ware and with software RAID.

> how many
> of them died (failed as hardware).

Not many, but some.  A handful of data corruption cases (maybe 5?  I 
don't have logs).  One BBU failure that resulted in a system that 
wouldn't boot until the BBU was removed (and a couple of hours of
downtime while 3ware techs worked that ticket).  A couple of times when
we really wanted to move an array of disks and couldn't.

Vs zero reliability issues with software RAID.

> How surge-free the power is.

We always used trusted UPS hardware.

> How
> well the guys who installed your cards into your boxes followed static
> discharge precautions

Our systems were built by a professional VAR.  Employees always wore 
ground straps.  There were other anti-static measures in place as well.
I had been to the facility.

And I'll throw you a curveball:

I've argued that software RAID is more reliable than 3ware cards, 
specifically.  In the world of ZFS and btrfs, you absolutely should not 
use hardware RAID.  As you mentioned earlier, hardware RAID volumes 
should scan disks regularly.  However, scanning mainly tells you whether
disks have bad sectors.  It can also detect data (parity) mismatches,
but it can't repair them, because hardware RAID cards don't have
information that can tell them which sectors are correct.  ZFS and
btrfs, on the other hand, checksum their data and metadata.  When
redundancy is available, they can tell which sectors are damaged in the
case of corruption such as bit flips, and which copy should be used to
repair the data.
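
To make that difference concrete, here is a toy sketch in Python.  It is
not how ZFS, btrfs, or any RAID firmware is actually implemented.  The
function names (mirror_scrub, checksummed_repair) and the choice of
SHA-256 are assumptions made up for the illustration.  It models a
two-way mirror where one copy has suffered a single bit flip.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # Stand-in for a filesystem block checksum (SHA-256 is an assumption).
    return hashlib.sha256(block).digest()


def mirror_scrub(copy_a: bytes, copy_b: bytes) -> bool:
    # A plain mirror scrub can only say whether the two copies agree.
    # On a mismatch it has no way to know which copy is the good one.
    return copy_a == copy_b


def checksummed_repair(copies, stored_sum):
    # With a checksum stored separately from the data, pick the copy
    # that still matches it and rewrite every copy from that one.
    good = next((c for c in copies if checksum(c) == stored_sum), None)
    if good is None:
        raise ValueError("no copy matches the stored checksum")
    return good, [good for _ in copies]


original = b"important data block"
stored_sum = checksum(original)      # recorded when the block was written

damaged = bytearray(original)
damaged[0] ^= 0x01                   # simulate a single bit flip
copy_a, copy_b = original, bytes(damaged)

# The scrub notices the mismatch but cannot choose a winner.
mismatch_detected = not mirror_scrub(copy_a, copy_b)

# The checksum identifies the intact copy, so the damage can be repaired.
good, repaired = checksummed_repair([copy_a, copy_b], stored_sum)
```

The point of the sketch is only that the checksum is extra information
kept apart from the data itself; without it, a mismatch between two
copies is a coin toss.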

Now, btrfs and ZFS are completely different from software RAID.  I'm not 
comparing the two.  But hardware RAID simply has too many deficiencies 
to justify its continued use.  It should go away.