[CentOS] clustered file system of choice

Simon Billis simon at houxou.com
Thu Jun 17 14:41:02 UTC 2010


Boris Epstein sent a missive on 2010-06-16:

> Hi all,
> 
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?
> 
> Thanks in advance for any and all advice.

Take a look at Hadoop (http://hadoop.apache.org), and specifically HDFS, the
Hadoop Distributed File System (http://hadoop.apache.org/hdfs/). I've used it
in conjunction with Nutch across 20-odd servers (circa 10TB). When I used it,
the downside was a single metadata node (the NameNode), which is a single
point of failure, but this may have changed by now. The data is stored
redundantly across the nodes and doesn't seem to require any special hardware
(I ran it on Dell 1425s).
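
Since the data is stored redundantly, raw disk does not translate one-to-one
into usable space. As a rough sizing sketch only: the Python below assumes
HDFS's default replication factor of 3 (the dfs.replication setting) and
ignores filesystem overhead and space reserved for non-HDFS use. The function
names and the per-node disk sizes are hypothetical, picked for illustration
and not taken from the cluster described above.

```python
import math


def usable_capacity_tb(nodes, raw_tb_per_node, replication=3):
    """Approximate usable space when every block is stored `replication` times."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return nodes * raw_tb_per_node / replication


def nodes_needed(data_tb, raw_tb_per_node, replication=3):
    """Smallest node count whose raw disk holds `data_tb` at the given replication."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return math.ceil(data_tb * replication / raw_tb_per_node)


if __name__ == "__main__":
    # Hypothetical figures: 20 nodes with 1.5 TB of raw disk each.
    print(usable_capacity_tb(20, 1.5))
    # Hypothetical figures: 30 TB of data on nodes with 2 TB of raw disk each.
    print(nodes_needed(30, 2.0))
```

With three copies of every block, storing tens of terabytes means buying
roughly three times that in raw disk, so it is worth running the numbers
before settling on hardware.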

HTH

Simon.