[CentOS] Question about optimal filesystem with many small files.

Thu Jul 9 04:28:28 UTC 2009
nate <centos at linuxpowered.net>

James A. Peltier wrote:

> There isn't a good file system for this type of thing.  filesystems with
> many very small files are always slow.  Ext3, XFS, JFS are all terrible
> for this type of thing.

I can think of one...though you'll pay out the ass for it: the
Silicon file system from BlueArc (NFS), where the file system runs
on FPGAs. Our BlueArcs never had more than 50,000-100,000 files in
any particular directory (millions in any particular tree), though
they are supposed to handle this sort of thing quite well.
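
Before spending that kind of money it's worth getting a baseline for
how your current ext3/XFS box behaves on this workload. A quick
throwaway Python script along these lines (the file count, file size
and directory name are just placeholders, tune them to match your
real data) creates a pile of small files in one directory and times
the creates and the stats:

import os, time, tempfile

NUM_FILES = 100000   # placeholder count, adjust for your workload
FILE_SIZE = 4096     # 4 KB per file, roughly "many small files"

base = tempfile.mkdtemp(prefix='smallfile-test-', dir='.')
payload = 'x' * FILE_SIZE

# time creating the files
start = time.time()
for i in range(NUM_FILES):
    f = open(os.path.join(base, 'f%06d' % i), 'wb')
    f.write(payload)
    f.close()
create_time = time.time() - start

# time listing and stat'ing them back
start = time.time()
names = os.listdir(base)
for name in names:
    os.stat(os.path.join(base, name))
stat_time = time.time() - start

print('created %d files in %.1fs (%.0f files/sec)' %
      (NUM_FILES, create_time, NUM_FILES / create_time))
print('stat    %d files in %.1fs (%.0f files/sec)' %
      (len(names), stat_time, len(names) / stat_time))

Run it against local disk and against whatever NFS mount you're
evaluating; drop the page cache between runs (on a 2.6.16+ kernel,
echo 3 > /proc/sys/vm/drop_caches) if you want the stat-side numbers
to mean anything.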

I think entry-level list pricing starts at about $80-100k for
one NAS gateway (no disks).

Our BlueArcs went end of life earlier this year and we migrated
to an Exanet cluster (it runs on top of CentOS 4.4 but uses its
own file system, clustering and NFS services), which is still
very fast, though not as fast as BlueArc.

And with block-based replication it doesn't matter how many
files there are; performance is excellent whether you're backing
up, sending data to another rack in your data center, or
replicating to another continent over the WAN. In BlueArc's case
you can also transparently send data to a dedupe device or tape
drive based on dynamic access patterns (and move it back
automatically when needed).

http://www.bluearc.com/html/products/file_system.shtml
http://www.exanet.com/default.asp?contentID=231

Both systems scale linearly to gigabytes/second of throughput
and petabytes of storage without downtime. The only downside
to BlueArc is their back-end storage: they only offer tier 2
storage themselves and only have HDS for tier 1. You can make
an HDS perform, but it'll cost you even more. The tier 2 stuff
is too unreliable (LSI Logic). Exanet at least supports
almost any storage out there (we went with 3PAR).

Don't even try to get a NetApp to do such a thing.

nate