I've been asked for ideas on building a rather large archival storage
system for in-house use, on the order of 100-400TB. Probably using CentOS
6. The existing system this would replace is using Solaris 10 and
ZFS, but I want to explore using Linux instead.
We have our own Tomcat-based archiving software that would run on this
storage server, along with NFS client and server. It's a write-once,
read-almost-never kind of application, storing compressed batches of
archive files for a year or two. 400TB written over 2 years translates
to about 200TB/year or about 7MB/second average write speed. The very
rare and occasional read accesses are done by batches where a client
makes a webservice call to get a specific set of files, then they are
pushed as a batch to staging storage where the user can then browse
them; this can take minutes without any problems.
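
As a sanity check on that rate, here's the back-of-envelope math
behind it (just arithmetic on the 400TB / 2 year figures above,
assuming decimal terabytes):

    # Average write rate for ~400 TB spread evenly over 2 years.
    TB = 10**12                       # assuming decimal terabytes
    total_bytes = 400 * TB
    seconds_per_year = 365 * 24 * 3600

    yearly_bytes = total_bytes / 2                  # ~200 TB/year
    avg_mb_per_sec = yearly_bytes / seconds_per_year / 10**6

    print(f"{yearly_bytes / TB:.0f} TB/year, "
          f"~{avg_mb_per_sec:.1f} MB/s average")
    # -> 200 TB/year, ~6.3 MB/s average; counting in binary TB
    #    instead gets it closer to the 7MB/second figure above.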
My general idea is a 2U server with 1-4 SAS cards connected to strings
of about 48 SATA disks (4 x 12 or 3 x 16), all configured as JBOD, so
there would potentially be 48 or 96 or 192 drives on this one server.
I'm thinking they should be laid out as 4 or 8 or 16 separate RAID6 sets
of 10 disks each, then use LVM to put those into a larger volume.
About 10% of the disks would be reserved as global hot spares.
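
To put rough numbers on that layout, here's a capacity sketch for the
smallest (48-drive) case; the 4TB drive size is only an assumed
example, the rest follows the 10-disk RAID6 plus ~10% spares plan
above:

    # Capacity sketch: drives split into 10-disk RAID6 sets (8 data +
    # 2 parity each), combined into one LVM volume group, with roughly
    # 10% of the drives held back as global hot spares.
    # The 4 TB drive size is an assumption for illustration only.
    DRIVE_TB = 4
    total_drives = 48

    spares = round(total_drives * 0.10)       # ~10% global hot spares
    usable_drives = total_drives - spares

    raid6_set_size = 10
    sets = usable_drives // raid6_set_size    # whole RAID6 sets
    data_disks_per_set = raid6_set_size - 2   # RAID6 costs 2 parity disks
    leftover = usable_drives - sets * raid6_set_size

    usable_tb = sets * data_disks_per_set * DRIVE_TB
    print(f"{sets} RAID6 sets, {spares} spares, {leftover} left over")
    print(f"~{usable_tb} TB usable in the LVM volume group")
    # 48 drives -> 4 sets of 10, 5 spares, 3 left over, ~128 TB usable

At this set size, parity plus spares eats roughly a third of the raw
capacity (about 128TB usable out of 192TB raw in this example); the 96
and 192 drive cases scale the same way.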
So, my questions...
A) Can CentOS 6 handle that many JBOD disks in one system? Is my upper
size too big, so that I should plan for 2 or more servers? What happens
with the device names once you've gone past /dev/sdz?
B) What is the status of large file system support in CentOS 6? I know
XFS is frequently mentioned with such systems, but I/we have zero
experience with it; it's never been natively supported in EL up to 5,
anyway.
C) Is GFS suitable for this, or is it strictly for clustered storage systems?
D) Anything important I've neglected?
--
john r pierce N 37, W 122
santa cruz ca mid-left coast