[CentOS] which filesystem to store > 250E6 small files in same or hashed directories

Mon Mar 14 20:10:36 UTC 2011
Dr. Ed Morbius <dredmorbius at gmail.com>

on 13:10 Sat 12 Mar, Alain Spineux (aspineux at gmail.com) wrote:
> Hi
> 
> I need to store about 250.000.000 files. Files are less than 4k.
> 
> On ext4 (Fedora 14), the system crawls at 10.000.000 files in the same directory.
> 
> I tried to create hash directories, two levels of 4096 dirs = 16.000.000,
> but I had to stop the script creating these dirs after hours,
> and "rm -rf" would have taken days! mkfs was my friend.
> 
> I tried two levels, the first of 4096 dirs, the second of 64 dirs.
> Creating the hash dirs took "only" a few minutes, but copying 10000
> files makes my HD scream for 120s! It takes only 10s when working in
> a single directory.
> 
> The filenames are all 27 chars and the first chars can be used to hash
> the files.
> 
> My question is: which filesystem, and how should I store these files?

I'd also question the architecture and suggest an alternative: a
hierarchical directory tree, a database, a "NoSQL" hashed lookup, or
something along those lines.  See squid for an example of using
directory trees to handle very large numbers of objects; a sketch of
that sort of layout follows below.  In fact, if you wired things up
right, you could probably use squid as a proxy back-end.
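
As a rough illustration (my own sketch, not squid's actual layout; the
/srv/store root and the 2+2 character fan-out are arbitrary
assumptions), placing each file under directories derived from the
leading characters of its name looks something like this:

    #!/bin/sh
    # Sketch: a 27-char name such as "abcdefghijklmnopqrstuvwxyz0"
    # would land in /srv/store/ab/cd/abcdefghijklmnopqrstuvwxyz0.
    # The fan-out depends on the filename alphabet; tune the number of
    # characters per level to keep each directory to a few thousand
    # entries.
    STORE=/srv/store            # hypothetical storage root
    f=$1
    base=$(basename "$f")
    d1=$(printf '%s' "$base" | cut -c1-2)
    d2=$(printf '%s' "$base" | cut -c3-4)
    mkdir -p "$STORE/$d1/$d2"
    cp "$f" "$STORE/$d1/$d2/"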

In general, I'd say a filesystem is the wrong approach to this problem.

What's the creation/deletion/update/lifecycle of these objects?  Are
they all created at once?  A few at a time?  Are they ever updated?  Are
they expired and/or deleted?

Otherwise, reiserfs and its hashed directory indexes scale well, though
I've only pushed it to about 125,000 entries in a single node.  There is
the usual comment about the viability of a filesystem whose principal
architect is in jail on a murder charge.

It's possible XFS/JFS might also work.  I'd suggest you test building
and deleting large directories.
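
Something along these lines (a sketch only; the mount point and file
count are placeholders) gives a quick feel for how a candidate
filesystem behaves on mass creates and unlinks:

    # Time creating, then removing, a large flat directory on the
    # filesystem under test.
    mkdir -p /mnt/test/big
    time sh -c 'i=0
                while [ $i -lt 1000000 ]; do
                    : > /mnt/test/big/$i
                    i=$((i+1))
                done'
    time rm -rf /mnt/test/big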

Incidentally, for testing, 'make -j' can be useful for parallelizing
processing, which would also test whether or not locking/contention on
the directory entry itself is going to be a bottleneck (I suspect it may
be).
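
In the same spirit (a sketch using plain shell background jobs rather
than make; the directory and the counts are made up), eight writers
hammering one directory at once will show whether that directory's
entry is the bottleneck:

    # Mimics driving the test with 'make -j8': eight concurrent
    # writers creating small files in a single directory.
    mkdir -p /mnt/test/contend
    for n in 1 2 3 4 5 6 7 8; do
        sh -c 'i=0
               while [ $i -lt 10000 ]; do
                   : > /mnt/test/contend/f$1.$i
                   i=$((i+1))
               done' sh $n &
    done
    wait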

You might also find that GNU 'find's "-depth" argument is useful for
deleting deep/large trees.
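
For instance (the path is a placeholder):

    # Depth-first traversal: entries are removed before the directory
    # that contains them; GNU find's -delete implies -depth.
    find /mnt/test/big -depth -delete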

-- 
Dr. Ed Morbius, Chief Scientist /            |
  Robot Wrangler / Staff Psychologist        | When you seek unlimited power
Krell Power Systems Unlimited                |                  Go to Krell!