On 7/14/2011 1:32 AM, John R Pierce wrote:
> I've been asked for ideas on building a rather large archival storage
> system for inhouse use, on the order of 100-400TB. Probably using CentOS
> 6. The existing system this would replace is using Solaris 10 and
> ZFS, but I want to explore using Linux instead.
>
> We have our own tomcat based archiving software that would run on this
> storage server, along with NFS client and server. Its a write once,
> read almost never kind of application, storing compressed batches of
> archive files for a year or two. 400TB written over 2 years translates
> to about 200TB/year or about 7MB/second average write speed. The very
> rare and occasional read accesses are done by batches where a client
> makes a webservice call to get a specific set of files, then they are
> pushed as a batch to staging storage where the user can then browse
> them, this can take minutes without any problems.

If it doesn't have to look exactly like a filesystem, you might like
Luwak, a layer over the Riak NoSQL distributed database built to handle
large files (http://wiki.basho.com/Luwak.html). The underlying storage is
distributed across any number of nodes, with a scheme that lets you add
more as needed and keeps redundant copies to handle node failures. A
downside of Luwak for most purposes is that because it chunks the data
and re-uses duplicate chunks, you can't remove anything, but for archive
purposes it might work well.

For something that looks more like a filesystem, but is also distributed
and redundant, there is MooseFS: http://www.moosefs.org/.

--
Les Mikesell
  lesmikesell at gmail.com
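If it helps anyone evaluating Luwak: it is driven over Riak's ordinary HTTP
interface, so the archiving app can talk to it with nothing more than an
HTTP client. Below is a minimal sketch of storing and fetching one
compressed batch, assuming the /luwak/<key> endpoint on Riak's default
HTTP port (8098) as documented on the Basho wiki page above; the host,
key names, paths, and helper function names are only illustrative, not
taken from a real install.

    import urllib.request

    # Assumed Riak HTTP listener with the Luwak large-file interface
    # mounted at /luwak/<key>; adjust host/port for your cluster.
    BASE = "http://localhost:8098/luwak"

    def put_batch(key, path):
        # PUT one compressed archive batch under a key; Luwak chunks it
        # and spreads the chunks across the Riak nodes. (The whole file
        # is read into memory here only to keep the example short.)
        with open(path, "rb") as f:
            req = urllib.request.Request(
                BASE + "/" + key,
                data=f.read(),
                headers={"Content-Type": "application/octet-stream"},
                method="PUT",
            )
            urllib.request.urlopen(req)

    def get_batch(key, path):
        # Fetch a stored batch back out to local staging storage.
        with urllib.request.urlopen(BASE + "/" + key) as resp, \
                open(path, "wb") as out:
            out.write(resp.read())

    if __name__ == "__main__":
        put_batch("archive-2011-07.tar.gz", "/tmp/archive-2011-07.tar.gz")
        get_batch("archive-2011-07.tar.gz", "/tmp/restored.tar.gz")

The trade-off is the one noted above: chunks are shared and never
reclaimed, which is fine for a write-once, read-almost-never archive.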