[CentOS] Recommendations for a "real RAID" 1 card on CentOS box

Mon Mar 10 19:20:09 UTC 2008
nate <centos at linuxpowered.net>

Robert Arkiletian wrote:
> On 3/10/08, nate <centos at linuxpowered.net> wrote:
>
>> You can turn on write back caching if you have a UPS as well
>>  (provided your UPS is wired into your system for a graceful shutdown)
>
> Hopefully you have a redundant PS unit. Having a UPS is not going to
> help if your PS fails.

That is true. Buying high quality parts up front means fewer problems
down the road. Not a sure bet, but a better one. Of the half dozen
systems I've been running at home for the past several years, none
has suffered a hardware failure of any kind (fortunately). I've been
running PC Power and Cooling power supplies for about 9 years now;
they are really high quality PSUs (the last one I bought was about
4 years ago, so I can't speak for their quality now).

I've had 15 power supplies fail across about 600 systems in the past
8 years at my various jobs. Probably 150+ disk failures during that
same time on the same systems. And maybe 3 RAID card failures (all
of which were caught before the system was put into use).
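
To put those counts on a common scale, here is a rough Python sketch that
turns them into failures per system-year. It assumes all ~600 systems were
in service for the full 8 years, which the numbers above do not actually
say, so treat the rates as ballpark lower bounds only. The names
(SYSTEMS, YEARS, rate_per_system_year) are made up for illustration.

```python
# Rough failure rates per system-year from the counts quoted above.
# Assumption (not stated in the post): all ~600 systems were in service
# for the full 8 years, so these are ballpark lower bounds at best.
SYSTEMS = 600
YEARS = 8

failures = {
    "power supply": 15,
    "disk": 150,       # the post says "150+", so this is a minimum
    "RAID card": 3,    # all caught before the systems went into use
}


def rate_per_system_year(count, systems=SYSTEMS, years=YEARS):
    """Failures divided by total system-years of exposure."""
    return count / (systems * years)


rates = {name: rate_per_system_year(n) for name, n in failures.items()}

for name, rate in rates.items():
    print(f"{name:12s} {rate * 100:.4f}% per system-year")

# Ratio of disk failures to power supply failures in the same fleet.
disk_to_psu = failures["disk"] / failures["power supply"]
print(f"disk failures per PSU failure: {disk_to_psu:.0f}")
```

Whatever the real exposure was, the ratio does not depend on it: at least
ten disk failures for every power supply failure in the same fleet.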

In 2005, when I purchased about 200 Cyclades managed power strips, I
had at least 10 of those fail, which was scary. Their QA at the time
was pretty poor (I toured their facility in early 2006). They claimed
I was one of only two customers having problems with their
power strips (and I had fewer than 100 of those PDUs in use at the
time, so a ~10% failure rate). They've since been bought by Avocent
and they're probably well on their way to outsourcing their manufacturing,
which they said would improve quality since it would force them to
write better specs for testing and the like.

So a BBU is certainly a nice thing to have, but at least in my
experience it isn't absolutely critical.

Of course, for absolutely critical things I don't use server-based
RAID anyway. Multiple redundant controllers and multiple redundant
paths (to both the disks and the hosts) are the way to go (assuming
your applications aren't built to run on something like a
distributed file system). I've seen that some of the
latest HP servers have dual-ported SAS disks, which sounds pretty
neat. I assume they still only have one controller though.

My main storage array has a built-in battery as well. It's pretty
cool in that if the power goes out, it keeps the controller operational
long enough to dump the contents of the cache (8GB) to an internal
IDE disk, then powers off. That eliminates the need to maintain
the battery during an extended (several-day) outage. And of course
the cache is mirrored between two controller nodes, and no writes
are committed to disk before the write is processed by both nodes.
If one node fails, the cache is disabled on the remaining node until
the other node recovers.
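
For anyone who wants that behaviour spelled out, here is a toy Python
model of it. This is only a sketch of the rules described above, not how
any real array firmware works. Every name in it (Node, MirroredArray,
dump_disk and so on) is hypothetical, and flushing the surviving cache to
disk when a node fails is my own assumption about what "disabled" implies.

```python
class Node:
    """One controller node with its own battery-backed write cache."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.cache = {}  # block number -> data


class MirroredArray:
    """Toy model: two nodes, mirrored cache, write-through when degraded."""

    def __init__(self):
        self.nodes = [Node("A"), Node("B")]
        self.disk = {}       # the backing disks
        self.dump_disk = {}  # internal disk used only for cache dumps

    def cache_enabled(self):
        # The cache is only used while both nodes are up.
        return all(node.alive for node in self.nodes)

    def write(self, block, data):
        if self.cache_enabled():
            # The write must land in both caches before it is accepted.
            for node in self.nodes:
                node.cache[block] = data
            return "cached"
        # Degraded: cache disabled, so go straight to disk.
        self.disk[block] = data
        return "written-through"

    def destage(self):
        # Copy cached writes to disk. Assumes at least one node is alive.
        survivor = next(node for node in self.nodes if node.alive)
        self.disk.update(survivor.cache)
        for node in self.nodes:
            node.cache.clear()

    def fail_node(self, index):
        self.nodes[index].alive = False
        # Assumption: the survivor flushes its copy before running uncached.
        self.destage()

    def recover_node(self, index):
        self.nodes[index].alive = True

    def power_loss(self):
        # On battery: dump the cache to the internal disk, then power off.
        survivor = next(node for node in self.nodes if node.alive)
        self.dump_disk.update(survivor.cache)


array = MirroredArray()
first = array.write(1, "x")    # both nodes up, so this is cached
array.fail_node(1)             # node B dies, cache is flushed and disabled
second = array.write(2, "y")   # goes straight to disk
array.recover_node(1)
third = array.write(3, "z")    # cache is back
array.power_loss()             # block 3 ends up on the internal dump disk
```

The point of the toy is just that a write is never held in only one
cache: either both nodes have it, or it is already on disk.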

nate