[CentOS] Question about optimal filesystem with many small files.

oooooooooooo ooooooooooooo hhh735 at hotmail.com
Thu Jul 9 06:04:25 UTC 2009


>There's C code to do this in squid, and backuppc does it in perl (for a 
pool directory where all identical files are hardlinked).

Unfortunately I have to write the files in a predefined format, so these would not provide the flexibility I need.

>Rethink how you're writing files or you'll be in a world of hurt.

It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory levels, each directory name a single hex character 0-9/A-F, giving 65536 (16^4) leaves, each leaf containing about 200 files). Do you think that this would really improve performance? Could this structure be improved?
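The scheme above (split the first four hex digits of a content hash into one directory level each) can be sketched roughly as below. This is only an illustration of the layout being discussed, not code from the thread; the root path, the choice of MD5, and the function name are all assumptions.

```python
import hashlib
import os

def cache_path(data: bytes, root: str = "/var/cache/files") -> str:
    """Map file content to a 4-level hash-based directory path.

    The first four hex digits of the MD5 digest become one directory
    level each, giving 16**4 = 65536 leaf directories.  With ~13M
    files spread evenly, each leaf holds roughly 200 files.
    NOTE: root path, hash choice and names are illustrative only.
    """
    digest = hashlib.md5(data).hexdigest()
    d1, d2, d3, d4 = digest[:4]            # e.g. '5', 'd', '4', '1'
    return os.path.join(root, d1, d2, d3, d4, digest)
```

Because the hash digits are close to uniformly distributed, the files spread evenly over the leaves without any central bookkeeping, and each directory stays small enough for fast lookups on ext3.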

>BTW, you can pretty much say goodbye to any backup solution for this type 
of project as well.  They'll all die dealing with a file system structure 
like this.

We don't plan to use backups (if the data gets corrupted, we can retrieve it again), but thanks for the advice.

>I think entry level list pricing starts at about $80-100k for
1 NAS gateway (no disks).

That's far above the budget... 

>depending on the total size of this cache files, as it was suggested
by nate - throw some hardware at it.

Same as above; it seems they don't want to spend more on hardware (so I have to deal with all the performance issues...). Anyway, if I can keep all the directories at around 200 files, I think I will be able to manage this with the current hardware.

Thanks for the advice.



More information about the CentOS mailing list