[CentOS] really large file systems with centos

Thu Jul 14 14:16:13 UTC 2011
Les Mikesell <lesmikesell at gmail.com>

On 7/14/2011 1:32 AM, John R Pierce wrote:
> I've been asked for ideas on building a rather large archival storage
> system for in-house use, on the order of 100-400TB, probably using
> CentOS 6.  The existing system this would replace is running Solaris 10
> and ZFS, but I want to explore using Linux instead.
>
> We have our own Tomcat-based archiving software that would run on this
> storage server, along with an NFS client and server.  It's a write-once,
> read-almost-never kind of application, storing compressed batches of
> archive files for a year or two.  400TB written over 2 years translates
> to about 200TB/year, or about 7MB/second average write speed.  The very
> rare and occasional reads are done in batches: a client makes a
> webservice call to request a specific set of files, which are then
> pushed as a batch to staging storage where the user can browse them;
> this can take minutes without any problems.
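
As a quick sanity check of that write-rate figure, here is a
back-of-the-envelope sketch in Python (assuming decimal terabytes and
365-day years):

    # Rough check of the quoted numbers: 400 TB written over 2 years.
    TB = 10**12                      # decimal terabytes (assumption)
    total_bytes = 400 * TB
    seconds = 2 * 365 * 24 * 3600    # two years, ignoring leap days

    avg_mb_per_sec = total_bytes / seconds / 10**6
    print("average write rate: %.1f MB/s" % avg_mb_per_sec)
    # prints roughly 6.3 MB/s, i.e. "about 7MB/second" as stated above

So the sustained write load really is tiny; the hard part is capacity
and redundancy, not throughput.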

If it doesn't have to look exactly like a file system, you might like 
Luwak, a layer over the Riak NoSQL distributed database for handling 
large files (http://wiki.basho.com/Luwak.html).  The underlying 
storage is distributed across any number of nodes with a scheme that 
lets you add more as needed, and it keeps redundant copies to handle 
node failures.  The downside of Luwak for most purposes is that, 
because it chunks the data and de-duplicates identical chunks, you 
can't remove anything; for archive purposes, though, it might work well.
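
If you want to experiment with it, Luwak is normally driven through
Riak's HTTP interface.  A minimal sketch in Python follows; the
localhost:8098 address and the /luwak resource are what I recall as the
Riak/Luwak defaults, and the key and file names are made up, so treat
all of that as assumptions to adjust for your own node:

    import requests

    BASE = "http://localhost:8098/luwak"   # default Riak HTTP port (assumption)

    # Store an archive batch under a key; Luwak chunks it across the cluster.
    with open("batch-2011-07-14.tar.gz", "rb") as f:
        r = requests.put(BASE + "/batch-2011-07-14", data=f,
                         headers={"Content-Type": "application/octet-stream"})
        r.raise_for_status()

    # Fetch it back later, streaming to local disk.
    r = requests.get(BASE + "/batch-2011-07-14", stream=True)
    r.raise_for_status()
    with open("restored.tar.gz", "wb") as out:
        for chunk in r.iter_content(chunk_size=1 << 20):
            out.write(chunk)

The same thing can be done with curl; the point is just that to the
client it looks like plain HTTP PUT/GET of whole objects rather than a
POSIX filesystem, which seems to fit a write-once archive workload.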

For something that looks more like a filesystem, but is also distributed 
and redundant: http://www.moosefs.org/.

-- 
   Les Mikesell
    lesmikesell at gmail.com