On Thu, 9 Jul 2009, oooooooooooo ooooooooooooo wrote:
It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory levels, each directory name being a single character from 0-9 and A-F, giving 65536 (16^4) leaves, each leaf containing 200 files). Do you think that this would really improve performance? Could this structure be improved?
If you don't plan on modifying the files after creation, I could see it working. You could also consider a Berkeley DB style database for quick and easy lookups on large amounts of data, but depending on your exact needs, maintenance might be a chore and not really feasible.
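To give a rough idea of what a Berkeley DB style lookup could look like, here is a minimal sketch. It uses Python's standard dbm module as a stand-in for Berkeley DB itself, and the function names, the database path and the idea of mapping a file name to a stored location are all assumptions made for illustration, not something from your setup.

```python
import dbm


def store_location(db_path, name, location):
    """Record where a file lives, keyed by its name (hypothetical schema)."""
    # "c" opens the database read-write and creates it if it is missing.
    with dbm.open(db_path, "c") as db:
        db[name.encode("utf-8")] = location.encode("utf-8")


def lookup_location(db_path, name):
    """Return the stored location for a name, or None if it is unknown."""
    with dbm.open(db_path, "c") as db:
        key = name.encode("utf-8")
        if key in db:
            return db[key].decode("utf-8")
    return None
```

A call such as store_location("index", "example.txt", "/data/A/B/C/D/example.txt") followed by lookup_location("index", "example.txt") would give the path back. The maintenance cost mentioned above shows up here too: the index has to be kept in sync with whatever is actually on disk.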
It's an interesting suggestion, but I don't know whether it would actually work the way you describe, since you would always have to compute the hash first before you could locate a file.
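For reference, the layout being discussed could be sketched roughly like this. It assumes the hash is taken over the file name with SHA-1 (the original post does not say which hash, or whether it covers the name or the contents), and hashed_path and its parameters are made-up names for illustration.

```python
import hashlib
from pathlib import Path


def hashed_path(root, filename, levels=4):
    """Build root/X/X/X/X/filename from the leading hex digits of a hash.

    Assumption: SHA-1 over the file name; any stable hash would do.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest().upper()
    # Each of the first `levels` hex characters (0-9, A-F) names one directory.
    return Path(root, *digest[:levels], filename)
```

With 4 levels this gives 16^4 = 65536 leaf directories, so at 200 files per leaf the tree holds about 13.1 million files. The hash has to be computed on every lookup, but hashing a short file name is cheap compared with a directory scan; hashing the file contents instead would be a different matter, since the contents would have to be read first.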