[CentOS] lots of small files in a folder on Linux centos
R P Herrold
herrold at centos.org
Sun Jul 24 21:50:11 UTC 2011
On Sun, 24 Jul 2011, Keith Roberts wrote:
>> By using a hash, we remove those constraints, and also gain
>> the virtuous effect for free of self-organizing a relatively
>> level dispersion of files to the destination directories
>
> Not followed the whole thread, but a SQL database index of
> the actual picture files, giving the path into the directory
> structure. Would that work?
Fortunately there is a full, freely accessible archive of all
posts to this mailing list. The link to that archive is in
the header of every message sent through this list. As such,
you need not speculate.
As I read the post initially, the problem was as stated in the
subject line, and the database issue was not in the forefront.
Per the initial problem description, the files were all
splatted into a single directory. The fastest database I know
of is the filesystem itself, used as a database; the addition
of the hashing is just the computation of a pointer to the
destination directory, and so is also O(1).
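As a minimal sketch of the hashing idea, and not the code from the original post: the destination directory can be computed from a digest of the file name, so finding a file needs no index lookup. The function name `hashed_path`, the choice of SHA-1, and the two-level, two-hex-character layout are all assumptions made for illustration.

```python
import hashlib
import os


def hashed_path(root, filename, levels=2, width=2):
    """Map a file name to a nested directory derived from its hash.

    Hypothetical helper: 'levels' directory levels, each named with
    'width' hex characters taken from the SHA-1 of the file name.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Slice the leading hex characters into one chunk per directory level.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)


if __name__ == "__main__":
    # The same name always maps to the same directory, so the path is
    # recomputed on demand rather than stored anywhere.
    print(hashed_path("/srv/pics", "img_0001.jpg"))
```

With two levels of two hex characters there are 256 * 256 = 65536 possible leaf directories, and a reasonably uniform hash spreads the files across them roughly evenly.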
Adding a database engine, with the overhead that it brings,
and, as the thread has already pointed out, in a domU as well
(not usually the best place to add the overhead of a
database), simply adds further points of mis-design.
“We should forget about small efficiencies, say about 97% of
the time: premature optimization is the root of all evil. Yet
we should not pass up our opportunities in that critical 3%. A
good programmer will not be lulled into complacency by such
reasoning, he will be wise to look carefully at the critical
code; but only after that code has been identified”
- Donald Knuth [1]
Once the implementation is 'correct', then it is time to do
A:B testing to see where the real problem lies ... which
testing was at the head of my initial post on this topic.
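One way such an A:B test might be sketched, under stated assumptions: build the same set of empty files once in a flat directory and once in hash-named subdirectories, then time a stat() of every file in each layout. The function name `time_lookups`, the file-name pattern, and the single-level 256-bucket layout are hypothetical, and a small run on a warm cache will not reproduce the behaviour of a directory holding very many files.

```python
import hashlib
import os
import tempfile
import time


def time_lookups(n_files=2000):
    """Return (flat_seconds, hashed_seconds) for stat()ing n_files files."""
    names = [f"img_{i:06d}.jpg" for i in range(n_files)]
    with tempfile.TemporaryDirectory() as root:
        flat = os.path.join(root, "flat")
        hashed = os.path.join(root, "hashed")
        os.makedirs(flat)
        flat_paths, hashed_paths = [], []
        for name in names:
            flat_file = os.path.join(flat, name)
            # Bucket name: first two hex characters of the name's SHA-1.
            bucket = hashlib.sha1(name.encode("utf-8")).hexdigest()[:2]
            hashed_file = os.path.join(hashed, bucket, name)
            os.makedirs(os.path.dirname(hashed_file), exist_ok=True)
            for path in (flat_file, hashed_file):
                with open(path, "w"):
                    pass  # an empty file is enough for a lookup test
            flat_paths.append(flat_file)
            hashed_paths.append(hashed_file)

        def timed(paths):
            start = time.perf_counter()
            for path in paths:
                os.stat(path)
            return time.perf_counter() - start

        return timed(flat_paths), timed(hashed_paths)


if __name__ == "__main__":
    flat_s, hashed_s = time_lookups()
    print(f"flat: {flat_s:.4f}s  hashed: {hashed_s:.4f}s")
```

To be meaningful, the file count should approach that of the real directory, and the test should run on the same filesystem and virtualization setup as the production system.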
-- Russ herrold
[1] http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf
A person not willing to pony up $2.73 for a used copy of 'The
Art of Computer Programming, Volume 3: Sorting and Searching',
which discusses the specific problem space here, may wish to
read and consider his rather nice lecture published by the
ACM.