[CentOS] Question about optimal filesystem with many small files.

oooooooooooo ooooooooooooo

hhh735 at hotmail.com
Sat Jul 11 00:01:53 UTC 2009


> You mentioned that the data can be retrieved from somewhere else. Is
> some part of this filename a unique key? 

The real key is up to 1023 characters long and is unique, but I have to trim it to 256 characters. Trimmed that way it is no longer unique unless I add the hash.
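
A minimal sketch of that naming scheme in Python, only to illustrate the idea. The function name short_name, the "-" separator, and the 256 default are assumptions for the example, and it assumes the key is plain ASCII and already safe to use as a file name (no "/" characters):

```python
import hashlib

def short_name(long_key: str, max_len: int = 256) -> str:
    """Trim long_key and append its MD5 so the short name stays unique."""
    # MD5 of the full, untrimmed key: always 32 hex characters.
    digest = hashlib.md5(long_key.encode("utf-8")).hexdigest()
    # Leave room for the separator and the digest within max_len.
    keep = max_len - len(digest) - 1
    return long_key[:keep] + "-" + digest
```

Two keys that share the same first few hundred characters still get different short names, because the digest is computed over the whole key before trimming.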

> Do you have to track this
> relationship anyway - or age/expire content? 

I have to track the long filename -> short filename relationship. Age is not relevant here.

> I'd try to arrange things
> so the most likely scenario would take the fewest operations. Perhaps a
> mix of hash+filename would give direct access 99+% of the time and you
> could move all copies of collisions to a different area. 

Yes, it's a good idea, but at this point I don't want to add more complexity to my app, and having a separate area for collisions would make it more complex.

> Then you could
> keep the database mapping the full name to the hashed path but you'd
> only have to consult it when the open() attempt fails.

As the long filename is up to 1023 chars long, I can't index it with MySQL (it has a lower maximum limit). That's why I use the hash, which is indexed. What I do is keep a list of just the MD5s of the cached files in memory in my app. Before going to MySQL, I first check whether the MD5 is in the list (really an RB-tree).
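
Roughly, the lookup order can be sketched like this in Python. This is not the actual app code: a plain set stands in for the RB-tree, and CacheIndex and db_lookup are hypothetical names, with db_lookup representing whatever function runs the MySQL query on the indexed hash column:

```python
import hashlib

class CacheIndex:
    """In-memory set of MD5s consulted before the database is touched."""

    def __init__(self, db_lookup):
        self._known = set()          # MD5 hex digests of cached files
        self._db_lookup = db_lookup  # hypothetical: digest -> stored record

    @staticmethod
    def _digest(long_key: str) -> str:
        return hashlib.md5(long_key.encode("utf-8")).hexdigest()

    def add(self, long_key: str) -> None:
        """Record that the file for long_key is now cached."""
        self._known.add(self._digest(long_key))

    def lookup(self, long_key: str):
        """Return the database record, or None without querying on a miss."""
        digest = self._digest(long_key)
        if digest not in self._known:
            return None              # not cached, so skip the database
        return self._db_lookup(digest)
```

The point of the in-memory check is that a miss costs no database round trip at all. Only keys already known to be cached reach the query.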




