[CentOS] 12-15 TB RAID storage recommendations

Tue Apr 13 19:25:36 UTC 2010
James A. Peltier <jpeltier at fas.sfu.ca>

On Tue, 13 Apr 2010, Boris Epstein wrote:

> Hello listmates,
>
> I would like to build a 12-15 TB RAID 5 data server to run under
> CentOS. Any recommendations as far as hardware, configuration, etc?
>
> Thanks.
>
> Boris.

Smaller volumes are best, but really it depends on your I/O type as well.
I have 15TB volumes loaded with medical imaging data that happily run and
fsck just fine.  We've had a couple of disk failures, and the MD3000
and MD1000 units handled them just fine, taking around 26 hours to sync 1TB
drives.  The file system here is XFS.
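
As a rough sanity check on that rebuild figure, the average resync rate can
be worked out from the numbers above.  This is only a sketch: it assumes the
26 hours applies to one 1TB drive, that 1TB means 10^12 bytes, and the
function name is made up for illustration.

```python
def rebuild_rate_mb_per_s(drive_bytes: float, hours: float) -> float:
    """Average resync throughput in decimal megabytes per second."""
    seconds = hours * 3600
    return drive_bytes / seconds / 1e6

# Assumption: one 1 TB (10**12 byte) drive resynced in about 26 hours.
rate = rebuild_rate_mb_per_s(1e12, 26)
print(f"{rate:.1f} MB/s")  # prints "10.7 MB/s"
```

A rate that low suggests the rebuild was sharing the disks with production
I/O, so larger drives would likely take proportionally longer.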

On the other hand, I have natural language data sets which are millions of
small files residing on a 4.5TB EXT4 file system.  This file system has
had a problem, and to this day I still cannot run a file system check
to correct the errors because the e4fsck program chews up more than 42GB
of memory and then dies.  For details, see:

https://bugzilla.redhat.com/show_bug.cgi?id=570639

What I'm trying to say is, understand your usage patterns.  Large
streaming files are far less intensive on the controller than millions of
small files.
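
One way to get a feel for the usage pattern of an existing data set is to
count files by size.  The following is a minimal sketch, not a tuned tool:
the bucket boundaries and function names are arbitrary choices made for
illustration.

```python
import os
from collections import Counter

# Hypothetical bucket upper bounds in bytes; adjust to suit the data set.
BUCKETS = [(4 * 1024, "<=4KiB"), (1024 ** 2, "<=1MiB"), (1024 ** 3, "<=1GiB")]

def bucket_for(size: int) -> str:
    """Return the label of the first bucket that the size fits into."""
    for limit, label in BUCKETS:
        if size <= limit:
            return label
    return ">1GiB"

def histogram(sizes) -> Counter:
    """Count how many of the given sizes fall into each bucket."""
    return Counter(bucket_for(s) for s in sizes)

def file_sizes(root: str):
    """Yield the size of every file under root, skipping unreadable entries."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                yield os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue

# Small demonstration on made-up sizes rather than a real tree.
print(histogram([100, 5000, 2 * 1024 ** 3]))
```

On a real tree it would be called as histogram(file_sizes("/some/path")).
A count dominated by the smallest bucket points at the millions-of-small-files
case rather than the streaming case.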

Understand your hardware and what it is capable of in each configuration:
RAID-0 vs 5 vs 6 vs 10.  It is incredibly important.
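
To make the capacity side of that trade-off concrete, here is a small
sketch using the standard definitions of each level.  The 16 x 1TB drive
count is a hypothetical example, not a recommendation, and the sketch
ignores hot spares and file system overhead.

```python
def usable_tb(level: int, drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for one array of identical drives."""
    if level == 0:
        minimum, data = 2, drives          # striping only, no redundancy
    elif level == 5:
        minimum, data = 3, drives - 1      # one drive's worth of parity
    elif level == 6:
        minimum, data = 4, drives - 2      # two drives' worth of parity
    elif level == 10:
        if drives % 2:
            raise ValueError("RAID-10 needs an even number of drives")
        minimum, data = 4, drives // 2     # every drive is mirrored
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} drives")
    return data * drive_tb

# Hypothetical shelf of 16 drives of 1 TB each.
for level in (0, 5, 6, 10):
    print(f"RAID-{level}: {usable_tb(level, 16, 1.0):.0f} TB usable")
```

With those assumed drives, RAID-5 yields 15 TB and RAID-6 yields 14 TB.
RAID-6 gives up one more drive of capacity but survives a second failure
during a long rebuild.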

Understand your file system.  Figure out what file system works best for 
your workload, how it functions and how the underlying hardware needs to 
be configured to maximize throughput.
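
As one example of matching the file system to the hardware, XFS can be told
the stripe geometry at mkfs time through the su (stripe unit) and sw (stripe
width, in data disks) suboptions of mkfs.xfs -d.  The sketch below only
builds that option string.  The 64KiB chunk size and 16-disk RAID-5 are
assumptions for illustration; the real values have to come from the
controller configuration.

```python
def xfs_stripe_options(total_disks: int, parity_disks: int, chunk_kib: int) -> str:
    """Build the mkfs.xfs -d su=,sw= options for a parity RAID array.

    su is the per-disk chunk size; sw is the number of data-bearing disks.
    """
    data_disks = total_disks - parity_disks
    if data_disks < 1:
        raise ValueError("array must have at least one data disk")
    return f"-d su={chunk_kib}k,sw={data_disks}"

# Assumed geometry: 16-disk RAID-5 (one parity disk) with 64 KiB chunks.
print(xfs_stripe_options(16, 1, 64))  # prints "-d su=64k,sw=15"
```

The resulting string would be passed to mkfs.xfs along with the device name.
Hardware RAID controllers often do not expose their geometry to the OS, which
is why it may need to be given by hand.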

That is all for now. ;)


-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
           http://blogs.sfu.ca/people/jpeltier
MSN     : subatomic_spam at hotmail.com

TEAMWORK
  There's power in numbers.  Learn to work together.