[CentOS] Question about optimal filesystem with many small files.

Thu Jul 9 17:09:31 UTC 2009
James A. Peltier <jpeltier at fas.sfu.ca>

On Thu, 9 Jul 2009, oooooooooooo ooooooooooooo wrote:

> It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory levels, each directory name would be a single character from 0-9 and A-F, and 65536 (16^4) leaves, each leaf containing 200 files). Do you think that this would really improve performance? Could this structure be improved?

If you don't plan on modifying the files after creation, I could see it 
working.  You could also consider a Berkeley DB style database for 
quick and easy lookups on large amounts of data, but depending on your 
exact needs, maintenance might be a chore and not really feasible.
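
To make the database idea concrete, here is a rough sketch using Python's
standard-library dbm.dumb module as a stand-in for a Berkeley DB style
key/value store. It is not Berkeley DB itself, and the function names
(store_blobs, fetch_blob) and the idea of keying on the file name are my
assumptions, not something from your setup.

```python
import dbm.dumb

def store_blobs(db_path, items):
    """Write each name -> bytes pair into one dbm-style database.

    db_path and the name-as-key scheme are illustrative assumptions.
    """
    # "c" opens the database read/write and creates it if it is missing.
    with dbm.dumb.open(db_path, "c") as db:
        for name, data in items.items():
            db[name.encode("utf-8")] = data

def fetch_blob(db_path, name):
    """Look up one entry by name; raises KeyError if it is absent."""
    # "r" opens an existing database read-only.
    with dbm.dumb.open(db_path, "r") as db:
        return db[name.encode("utf-8")]
```

Whether this beats plain files depends on your access pattern; treat it as
a starting point for a test, not a recommendation.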

It's an interesting suggestion, but I don't know whether it would actually 
work the way you describe, since you would always have to compute the hash 
first.
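
For reference, the layout you describe (4 levels, one character from 0-9
and A-F per level) could be sketched in Python roughly as follows. I am
assuming the hash is taken over the file name rather than the contents,
and SHA-256 and the function name hashed_path are my choices for
illustration only.

```python
import hashlib
from pathlib import Path

def hashed_path(root, name, levels=4):
    """Map a file name to root/X/X/X/X/name, one hex character per level.

    Assumption: the hash covers the file name, not the file contents.
    """
    # Uppercase hex digest, so every character is in 0-9 or A-F.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest().upper()
    # The first `levels` characters each become one directory level.
    return Path(root).joinpath(*digest[:levels], name)

def store(root, name, data):
    """Write data to its hashed location, creating directories as needed."""
    target = hashed_path(root, name)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

With 16^4 = 65536 leaves at 200 files each, this covers about 13.1 million
files. The hash has to be computed on every lookup, but hashing a short
name is cheap compared with a directory scan.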

James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
MSN     : subatomic_spam at hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.