[CentOS] Question about optimal filesystem with many small files.

Wed Jul 8 17:29:39 UTC 2009
Gary Greene <ggreene at minervanetworks.com>

On 7/8/09 8:56 AM, "Les Mikesell" <lesmikesell at gmail.com> wrote:
> oooooooooooo ooooooooooooo wrote:
>> Hi,
>> 
>> I have a program that writes lots of files to a directory tree (around 15
>> million files), and a node can have up to 400000 files (and I don't have
>> any way to split this amount into smaller ones). As the number of files
>> grows, my application gets slower and slower (the app works something like a
>> cache for another app, and I can't redesign the way it distributes files
>> onto disk due to the other app's requirements).
>> 
>> The filesystem I use is ext3 with the following options enabled:
>> 
>> Filesystem features:      has_journal resize_inode dir_index filetype
>> needs_recovery sparse_super large_file
>> 
>> Is there any way to improve performance in ext3? Would you suggest another FS
>> for this situation (this is a production server, so I need a stable one)?
>> 
>> Thanks in advance (and please excuse my bad English).
> 
> I haven't done, or even seen, any recent benchmarks, but I'd expect
> reiserfs to still be the best at that sort of thing.  However, even if
> you can improve things slightly, do not let whoever is responsible for
> that application ignore the fact that it is a horrible design that
> ignores a very well-known problem with easy solutions.  And don't
> ever do business with someone who would write a program like that again.
> Any way you approach it, when you want to write a file the system must
> check whether the name already exists and, if not, create it in an
> empty space that it must also find - and this must be done atomically, so
> the directory must be locked against other concurrent operations until
> the update is complete.  If you don't index the contents, the lookup is a
> slow linear scan - if you do, you then have to rewrite the index on
> every change, so you can't win.  Sensible programs that expect to access
> a lot of files will build a tree structure to break up the number that
> land in any single directory (see squid for an example).  Even more
> sensible programs would re-use some existing caching mechanism like
> squid or memcached instead of writing a new one badly.
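
For what it's worth, the squid-style layout mentioned above boils down to
hashing each cache key into a small, fixed tree of subdirectories so that no
single directory ever holds more than a few thousand entries. A minimal
sketch in Python, where the md5 hash and the 16 x 256 fan-out are purely
illustrative assumptions, not anything the original application does:

import hashlib
import os

L1_DIRS = 16    # first-level fan-out (assumed)
L2_DIRS = 256   # second-level fan-out (assumed)

def shard_path(cache_root, key):
    """Map a cache key to cache_root/<l1>/<l2>/<digest>."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    l1 = int(digest[0:2], 16) % L1_DIRS
    l2 = int(digest[2:4], 16) % L2_DIRS
    return os.path.join(cache_root, "%02X" % l1, "%02X" % l2, digest)

def store(cache_root, key, data):
    """Write one cache entry under its sharded path."""
    path = shard_path(cache_root, key)
    parent = os.path.dirname(path)
    if not os.path.isdir(parent):
        os.makedirs(parent)
    with open(path, "wb") as f:
        f.write(data)
    return path

With 16 x 256 = 4096 leaf directories, 15 million files works out to a few
thousand entries per directory instead of 400000 in one place, which is well
within what ext3's dir_index handles comfortably.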

In many ways this is similar to the issues you'll see on a very active mail or
news server that uses maildir, where the directory entries grow too numerous
to be traversed quickly. The only way to deal with it (especially if the
application adds and removes these files regularly) is to periodically copy
the files to another directory, nuke the original directory, and restore from
the copy; a rough sketch of that rotation is below. This is why databases are
better for this kind of intensive data caching.
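
For the rebuild itself, something along these lines works, on the assumption
that the application can be paused while the rotation runs (the path and the
temporary suffix are illustrative):

import os
import shutil

def compact_directory(path):
    """Copy out, nuke, and restore a directory so its entry list shrinks."""
    tmp = path + ".compact"
    shutil.copytree(path, tmp)   # copy the files to a fresh directory
    shutil.rmtree(path)          # nuke the bloated original
    os.rename(tmp, path)         # restore under the original name

The reason the rebuild helps is that ext3 never shrinks a directory's own
entry blocks once they have grown, so deleting files doesn't bring the
traversal cost back down; only recreating the directory does.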

-- 
Gary L. Greene, Jr.
IT Operations
Minerva Networks, Inc.
Cell:  (650) 704-6633
Phone: (408) 240-1239