[CentOS] Question about optimal filesystem with many small files.

James A. Peltier jpeltier at fas.sfu.ca
Thu Jul 9 17:09:31 UTC 2009


On Thu, 9 Jul 2009, oooooooooooo ooooooooooooo wrote:

>
> It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory levels, each directory name would be a single character from 0-9 and A-F, and 65536 (16^4) leaves, each leaf containing 200 files). Do you think that this would really improve performance? Could this structure be improved?
>

If you don't plan on modifying the files after creation, I could see it
working.  You could also consider a Berkeley DB style database for
quick and easy lookups on large amounts of data, but depending on your
exact needs, maintenance might be a chore and not really feasible.
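
As a rough illustration of the database alternative, here is a minimal
sketch that keeps many small blobs in one on-disk key-value file.  It
uses Python's standard dbm module as a stand-in for a Berkeley DB style
store, which is an assumption for illustration only; the function names
and the sample file names are hypothetical.

```python
import dbm
import os
import tempfile


def store_blobs(db_path, items):
    """Write each (name, data) pair into a single key-value database file."""
    # "c" opens the database read-write and creates it if it is missing.
    with dbm.open(db_path, "c") as db:
        for name, data in items:
            # dbm stores keys and values as bytes.
            db[name.encode("utf-8")] = data


def fetch_blob(db_path, name):
    """Return the stored bytes for name, or None if the key is absent."""
    with dbm.open(db_path, "r") as db:
        return db.get(name.encode("utf-8"))


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    db_path = os.path.join(workdir, "blobs")
    store_blobs(db_path, [("a.txt", b"hello"), ("b.txt", b"world")])
    print(fetch_blob(db_path, "a.txt"))
```

The trade-off is the one mentioned above: lookups avoid large
directories, but backups, repair and concurrent writers all become the
application's problem instead of the filesystem's.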

It's an interesting suggestion, but I don't know whether it would
actually work as you describe, since you would always have to compute
the hash first.
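
For reference, the layout from the quoted question (4 levels, one hex
character per level, 16^4 = 65536 leaves, so about 13.1 million files
at 200 per leaf) can be sketched as below.  The choice of SHA-1, hashing
the file name rather than the file contents, and the root path are all
assumptions for illustration; the original post does not specify them.

```python
import hashlib
import os


def hashed_path(root, name, levels=4):
    """Map a file name to root/X/X/X/X/name using leading hex digits of its hash."""
    # Assumption: hash the file name with SHA-1; any stable hash would do.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest().upper()
    # One directory level per hex character (0-9, A-F).
    parts = list(digest[:levels])
    return os.path.join(root, *parts, name)


def store_file(root, name, data):
    """Create the leaf directory if needed and write the file into it."""
    path = hashed_path(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

Computing the hash of a short name is cheap compared with a directory
lookup on disk, so the extra step on every access is likely to matter
less than the reduction in directory size, though that would need to be
measured on the real workload.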

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
           http://blogs.sfu.ca/people/jpeltier
MSN     : subatomic_spam at hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.


