On Tue, 13 Apr 2010, Boris Epstein wrote:
Hello listmates,
I would like to build a 12-15 TB RAID 5 data server to run under CentOS. Any recommendations as far as hardware, configuration, etc.?
Thanks.
Boris.
Smaller volumes are best, but really it depends on your I/O type as well. I have 15TB volumes loaded with medical imaging data that happily run and fsck just fine. We've had a couple of disk failures, and the MD3000 and MD1000 units handled them just fine, taking around 26 hours to sync 1TB drives. The file system here is XFS.
On the other hand, I have natural language data sets made up of millions of small files residing on a 4.5TB EXT4 file system. This file system developed a problem, and to this day I still cannot run a file system check to correct the errors, because the e4fsck program chews up more than 42GB of memory and then dies. For details, see:
https://bugzilla.redhat.com/show_bug.cgi?id=570639
What I'm trying to say is: understand your usage patterns. Large streaming files are far less intensive on the controller than millions of small files.
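If you are not sure which side of that line your data falls on, a quick count of files by size will tell you. Here is a rough sketch in Python; the bucket boundaries and the function names (profile, bucket_for) are my own picks for illustration, not anything standard, so adjust to taste.

```python
import os
import sys
from collections import Counter

# Hypothetical size buckets, chosen only for illustration.
BUCKETS = [
    (4 * 1024, "<= 4 KiB"),
    (1024 ** 2, "<= 1 MiB"),
    (1024 ** 3, "<= 1 GiB"),
]


def bucket_for(size):
    """Return the label of the first bucket whose limit is >= size."""
    for limit, label in BUCKETS:
        if size <= limit:
            return label
    return "> 1 GiB"


def profile(root):
    """Walk root and return (file count per size bucket, total bytes)."""
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are measured, not followed
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            counts[bucket_for(size)] += 1
            total_bytes += size
    return counts, total_bytes


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    counts, total_bytes = profile(root)
    for label, n in counts.most_common():
        print(f"{label:>10}: {n} files")
    print(f"total: {total_bytes} bytes")
```

Point it at a representative directory. If nearly everything lands in the smallest bucket, you are in the millions-of-small-files camp and should plan the controller and file system accordingly.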
Understand your hardware and what it is capable of in each configuration: RAID-0 vs 5 vs 6 vs 10. It is incredibly important.
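To make the capacity side of that comparison concrete, here is a small sketch of the usual usable-space arithmetic for each level. It assumes identical drives and ignores hot spares, file system overhead and the TB vs TiB difference. The 16-drive count is just an example I picked; the 1TB drive size matches the drives mentioned above. The function name usable_tb is made up for this sketch.

```python
def usable_tb(level, drives, drive_tb):
    """Usable capacity in TB for an array of identical drives.

    RAID-0:  all drives hold data (no redundancy)
    RAID-5:  one drive's worth of space goes to parity
    RAID-6:  two drives' worth of space goes to parity
    RAID-10: half the drives mirror the other half
    """
    if level == 0:
        if drives < 2:
            raise ValueError("RAID-0 needs at least 2 drives")
        return drives * drive_tb
    if level == 5:
        if drives < 3:
            raise ValueError("RAID-5 needs at least 3 drives")
        return (drives - 1) * drive_tb
    if level == 6:
        if drives < 4:
            raise ValueError("RAID-6 needs at least 4 drives")
        return (drives - 2) * drive_tb
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID-10 needs an even count of 4 or more drives")
        return (drives // 2) * drive_tb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    drives, drive_tb = 16, 1.0  # example values, not a recommendation
    for level in (0, 5, 6, 10):
        print(f"RAID-{level}: {usable_tb(level, drives, drive_tb):.1f} TB usable")
```

Capacity is only one axis. Write performance, rebuild time and how many simultaneous drive failures the array survives differ between levels as well, and those matter at least as much for a volume this size.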
Understand your file system. Figure out what file system works best for your workload, how it functions and how the underlying hardware needs to be configured to maximize throughput.
That is all for now. ;)