[CentOS] Question about optimal filesystem with many small files.

Wed Jul 8 17:29:39 UTC 2009
Gary Greene <ggreene at minervanetworks.com>

On 7/8/09 8:56 AM, "Les Mikesell" <lesmikesell at gmail.com> wrote:
> oooooooooooo ooooooooooooo wrote:
>> Hi,
>> I have a program that writes lots of files to a directory tree (around 15
>> million files), and a node can have up to 400,000 files (and I have no way
>> to split that amount into smaller groups). As the number of files grows,
>> my application gets slower and slower (the app works something like a
>> cache for another app, and I can't redesign the way it distributes files
>> on disk due to the other app's requirements).
>> The filesystem I use is ext3 with the following options enabled:
>> Filesystem features:      has_journal resize_inode dir_index filetype
>> needs_recovery sparse_super large_file
>> Is there any way to improve performance in ext3? Would you suggest
>> another FS for this situation (this is a production server, so I need a
>> stable one)?
>> Thanks in advance (and please excuse my bad English).
> I haven't done, or even seen, any recent benchmarks, but I'd expect
> reiserfs to still be the best at that sort of thing.  However, even if
> you can improve things slightly, do not let whoever is responsible for
> that application ignore the fact that it is a horrible design that
> ignores a very well-known problem with easy solutions.  And don't
> ever again do business with someone who would write a program like that.
>   Any way you approach it, when you want to write a file the system must
> check to see if the name already exists, and if not, create it in an
> empty space that it must also find - and this must be done atomically so
> the directory must be locked against other concurrent operations until
> the update is complete.  If you don't index the contents, the lookup is a
> slow linear scan; if you do, you then have to rewrite the index on
> every change, so you can't win.  Sensible programs that expect to access
> a lot of files will build a tree structure to break up the number that
> land in any single directory (see squid for an example).  Even more
> sensible programs would re-use some existing caching mechanism like
> squid or memcached instead of writing a new one badly.
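
For anyone who can control the on-disk layout, a minimal sketch of the
squid-style fan-out Les describes might look like the Python below. The
hash choice, the 16 x 256 fan-out, and the cache root path are all
illustrative assumptions, not anything the poster's cache actually uses:

    import hashlib
    import os

    CACHE_ROOT = "/var/cache/myapp"  # hypothetical cache root

    def path_for(name):
        """Hash a flat file name into a two-level subdirectory tree so
        no single directory has to hold more than a few hundred entries."""
        digest = hashlib.md5(name.encode("utf-8")).hexdigest()
        # digest[0] picks one of 16 top-level dirs, digest[1:3] one of
        # 256 below it: 400,000 files spread to roughly 100 per directory.
        subdir = os.path.join(CACHE_ROOT, digest[0], digest[1:3])
        if not os.path.isdir(subdir):
            os.makedirs(subdir)
        return os.path.join(subdir, name)

    # usage:
    with open(path_for("some-cached-object"), "w") as f:
        f.write("cached data\n")

With a layout like that, lookup and create costs stay bounded no matter
how many files the cache holds in total.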

In many ways this is similar to what you'll see on a very active mail or
news server that uses maildir, where the directory entries grow too large
to be traversed quickly. Since a directory's on-disk size never shrinks,
even after files are deleted, the only way to deal with it (especially if
the application adds and removes these files regularly) is to periodically
copy the files to another directory, nuke the original, and restore from
the copy. This is why databases are better for this kind of intensive
data caching.
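
A rough sketch of that rebuild in Python, assuming the application can be
paused while the swap happens; this variant moves the entries into a fresh
directory and renames it into place, which shrinks the dentry list just as
well as copy/nuke/restore. The path and function name are illustrative:

    import os
    import shutil

    def rebuild_dir(path):
        """Shrink a bloated directory: move its entries into a fresh
        directory, then rename the fresh one into the old one's place."""
        tmp = path + ".rebuild"
        os.mkdir(tmp)
        for name in os.listdir(path):
            shutil.move(os.path.join(path, name), os.path.join(tmp, name))
        old = path + ".old"
        os.rename(path, old)  # old dir is now empty but still oversized
        os.rename(tmp, path)
        os.rmdir(old)

    rebuild_dir("/var/spool/cache/node042")  # hypothetical node directory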

Gary L. Greene, Jr.
IT Operations
Minerva Networks, Inc.
Cell:  (650) 704-6633
Phone: (408) 240-1239