[CentOS] using Linux as a NAS / SAN device

Fri Aug 28 19:26:30 UTC 2009
nate <centos at linuxpowered.net>

Rudi Ahlers wrote:

> But the one piece of the puzzle that I don't understand, will a
> self-build-Linux NAS device, or even Openfiler / FreeNAS give us that
> kind of uptime.

You say that downtime is not an option, so I can say with
absolute confidence there really is nothing you can build for
the budget you're looking for that will provide 100% uptime.

Either set expectations for the budget you have or get a bigger
budget to satisfy the requirements.

There are really only a few storage systems in the world whose
vendors will put money down on a 100% uptime SLA, and they are all
multi-million dollar systems. Even then they will just pay you for
any downtime caused by the storage; that doesn't mean there won't
ever be downtime. At least one vendor, Hitachi, claims they have yet
to pay out on that guarantee (at least as of late last year
when I last talked to them).

Depending on space and performance requirements you can get
a system that's built for 99.999% uptime for about $90-120k in the
U.S.
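
To put those percentages in perspective, here is a rough sketch of how
much downtime per year each availability target actually allows. The
function name and the list of targets are just for illustration, and it
assumes a 365-day year.

```python
# Allowed downtime per year for a given availability target.
# Assumes a 365-day year (leap years ignored).
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_budget_seconds(availability_percent: float) -> float:
    """Seconds of downtime per year permitted by an availability target."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


# Illustrative targets; only 99.999% and 100% are discussed in this thread.
for target in (99.9, 99.99, 99.999, 100.0):
    minutes = downtime_budget_seconds(target) / 60
    print(f"{target}% -> {minutes:.1f} minutes/year")
```

Five nines works out to a little over five minutes of downtime per
year, which is why it takes redundant everything to get there.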

Even my own new storage system, which as configured lists for about
$990k, does not guarantee 100% uptime; the vendor's goal is 99.999%.
So far we've had 100% uptime over the past year. We've had two soft
failures on the system. In the first, a Fibre Channel HBA's firmware
crashed and dumped, and the system automatically restarted the HBA
chip. In the second, a system-level software component segfaulted
(the system runs on Debian) and the system auto-restarted it. There
was no noticeable impact in either case, as everything is connected
to at least two active-active controllers.

Providing high availability storage is not a simple task. Take
for example a simple thing such as drive firmware upgrades. Our
storage system had to undergo drive firmware upgrades this past
weekend due to a bug in the Seagate SATA drives which under very
rare conditions could cause data corruption. The array handled
the firmware upgrades itself, upgrading one drive at a time. It took
about 16 hours for 200 disks, with zero impact to the system.
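
As a back-of-the-envelope check on those numbers, the sketch below
works out the per-drive time implied by 16 hours for 200 disks and
scales it to other array sizes. It assumes drives are flashed strictly
one at a time at a constant rate; the function name and the 48-drive
example are hypothetical.

```python
def rolling_upgrade_hours(drive_count: int, minutes_per_drive: float) -> float:
    """Wall-clock hours when drives are flashed strictly one at a time."""
    return drive_count * minutes_per_drive / 60


# Figures from the upgrade described above: about 16 hours for 200 disks.
minutes_per_drive = 16 * 60 / 200
print(f"{minutes_per_drive:.1f} minutes per drive")

# Hypothetical smaller array at the same per-drive rate.
print(f"{rolling_upgrade_hours(48, minutes_per_drive):.1f} hours for 48 drives")
```

That is a bit under five minutes per drive, during which the array has
to keep serving I/O with that drive out of service.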

If you're building a system yourself, in my experience it's highly
unlikely that you will ever be alerted to such a problem in the
drive firmware, let alone have to go through the process of
upgrading the drives. Fortunately critical drive firmware updates
are somewhat rare, but I think they will become more common
as more systems move to SATA, which for the most part is lower
quality and less tested.

One guy I met with a couple of years ago had an entirely SATA
drive system from another vendor using Western Digital drives,
and there was a NASTY firmware bug in that system as well. It
continually impacted production: the drives at random times
would just flat out stall, and you had to physically remove them
from the array and re-insert them to cycle them and get them up
again. The array vendor had no way of flashing drives
automatically at the time, so he was faced with flashing each and
every drive individually in another system(s). Eventually the
vendor fixed their software to allow automatic firmware updates,
but that's just another example of the complexities involved
with high availability storage, and that's just at the block
storage level.

On some of our Dell servers we had to manually boot with a floppy
to DOS to flash some Seagate SCSI drive firmware, as the firmware
they shipped with killed performance (500% faster with newer
firmware for our app).

Then you need to take into account things like MPIO and active-active
or active-passive storage controllers. If you get into
file-based storage, there is another layer of availability
bolted on top of that as well, which can further complicate things.

Our last NAS vendor is well known in the ultra high performance
arena, but even with an active-active NAS cluster they could not
do a major software upgrade without hard cluster downtime. And
failover took upwards of 60 seconds.

> Ideally I would like have a highly-redundant storage device which can
> be used by numerous users, and also host Virtual Machines on it. So IO
> will be the biggest concern, in terms of speed, with reliability the
> 2nd biggest concern.

You say IO is the biggest concern, yet below you plan to use SATA
disks?! That doesn't make sense unless you plan to have a large
number of SATA disks. A SATA disk has 1/2 the I/O capacity of a
10k RPM disk, and 1/3rd the I/O capacity of a 15k RPM disk.
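
Here is a rough sketch of what those ratios mean for spindle count.
Only the 1/2 and 1/3 ratios come from the paragraph above; the absolute
per-drive IOPS figures and the 3000 IOPS workload are assumptions picked
for illustration, and the estimate ignores RAID write penalties and
controller cache.

```python
import math

# Assumed random IOPS per drive. Only the ratios (SATA = 1/2 of 10k,
# 1/3 of 15k) come from the text; the absolute values are illustrative.
IOPS_PER_DRIVE = {"sata": 75, "10k": 150, "15k": 225}


def drives_needed(target_iops: int, drive_type: str) -> int:
    """Minimum number of drives to reach target_iops, ignoring RAID overhead."""
    return math.ceil(target_iops / IOPS_PER_DRIVE[drive_type])


target = 3000  # hypothetical workload
for drive_type in IOPS_PER_DRIVE:
    print(f"{drive_type}: {drives_needed(target, drive_type)} drives")
```

Whatever the real per-drive numbers are for your disks, the ratio holds:
you need roughly two to three times as many SATA spindles for the same
I/O.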

> The other question is, how well will my own Linux / UNIX based NAS
> perform? Surely these companies who build their own NAS devices spend
> a lot of time fine-tuning the OS to deliver the best performance, and
> probably spend a lot of time researching and testing different
> hardware devices and configurations to see what works best?

You sound like you want something that is fast, very highly
available, cheap, has lots of space, and is easy to manage. Such
a system doesn't really exist (depending on your view of how
cheap is cheap). The reason it doesn't exist is that it's
really complicated to get right.

You're setting yourself up for major disappointment or a massive
headache down the road. Pick a subset of your requirements, find
a solution that fits it, and set expectations/SLAs to match that
solution, whatever it may be.

If I were in your position I would opt for something that is
as simple to manage as possible, and limit the services you
provide through it. Get decent hardware and set up some sort of
replication to a 2nd identical system (I myself would avoid things
like DRBD).

At the very least opt for a good SCSI RAID controller and
a shelf of external disks; don't use the internal drive bays
on a system. On the low end HP is a good fit, Infortrend
has some pretty good stuff, LSI as well.

If you want to go a bit higher end, get a Fibre Channel storage
system (same vendors) with redundant controllers and connect to
it via FC.

Make sure the drive models and firmware are certified with the
controller/storage system you're getting.

nate