on 13:10 Sat 12 Mar, Alain Spineux (aspineux at gmail.com) wrote:
> Hi
>
> I need to store about 250.000.000 files. Files are less than 4k.
>
> On a ext4 (fedora 14) the system crawl at 10.000.000 in the same directory.
>
> I tried to create hash directories, two level of 4096 dir = 16.000.000
> but I had to stop the script to create these dir after hours
> and "rm -rf" would have taken days ! mkfs was my friend
>
> I tried two levels, first of 4096 dir, second of 64 dir. The creation
> of the hash dir took "only" few minutes,
> but copying 10000 files make my HD scream for 120s ! I take only 10s
> when working in the same directory.
>
> The filenames are all 27 chars and the first chars can be used to hash
> the files.
>
> My question is : Which filesystem and how to store these files ?

I'd also question the architecture and suggest an alternate approach:
hierarchical directory tree, database, "nosql" hashing lookup, or other
approach. See squid for an example of using directory trees to handle
very large numbers of objects. In fact, if you wired things up right,
you could probably use squid as a proxy back-end.

In general, I'd say a filesystem is the wrong approach to this problem.

What's the creation/deletion/update/lifecycle of these objects? Are
they all created at once? A few at a time? Are they ever updated? Are
they expired and/or deleted?

Otherwise, reiserfs and its hashed directory indexes scale well, though
I've only pushed it to about 125,000 entries in a single node. There is
the usual comment about viability of a filesystem whose principal
architect is in jail on a murder charge.

It's possible XFS/JFS might also work. I'd suggest you test building
and deleting large directories.

Incidentally, for testing, 'make -j' can be useful for parallelizing
processing, which would also test whether or not locking/contention on
the directory entry itself is going to be a bottleneck (I suspect it
may be).

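For the "test building and deleting large directories" part, here is a rough Python sketch of one way to do it. It is only an illustration under assumptions of mine, not something from the original setup: the function names (shard_path, time_create_and_delete), the base_dir parameter, and the sha1-derived 27-character hex names are all hypothetical stand-ins. It assumes the leading characters of each name are evenly distributed. With 3 + 2 hex characters that gives 4096 x 256 leaf directories, or roughly 240 files per leaf at 250,000,000 files.

```python
import hashlib
import os
import shutil
import tempfile
import time


def shard_path(root, name, widths=(3, 2)):
    """Return root/<first 3 chars>/<next 2 chars>/name for the default widths."""
    parts = []
    pos = 0
    for width in widths:
        parts.append(name[pos:pos + width])
        pos += width
    return os.path.join(root, *parts, name)


def time_create_and_delete(n_files, widths=(3, 2), base_dir=None):
    """Create n_files small files in a sharded tree, then remove the tree.

    base_dir should point at the filesystem under test; None means the
    system default temp location. Returns (create_seconds, delete_seconds).
    """
    root = tempfile.mkdtemp(prefix="shardtest-", dir=base_dir)
    try:
        start = time.perf_counter()
        for i in range(n_files):
            # Hypothetical stand-in for the real 27-character names.
            name = hashlib.sha1(str(i).encode()).hexdigest()[:27]
            path = shard_path(root, name, widths)
            # Directories are created lazily, only when a file needs them.
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as fh:
                fh.write(b"x" * 1024)
        created = time.perf_counter() - start

        start = time.perf_counter()
        shutil.rmtree(root)
        deleted = time.perf_counter() - start
        return created, deleted
    finally:
        # Clean up even if the run is interrupted part-way.
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    created, deleted = time_create_and_delete(10000)
    print("create: %.2fs  delete: %.2fs" % (created, deleted))
```

Creating directories on demand avoids the hours-long up-front pass over millions of empty directories. Run it with different widths and file counts on each candidate filesystem and compare the two timings.
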
You might also find that GNU find's "-depth" option is useful for
deleting deep/large trees, since it processes a directory's contents
before the directory itself.

-- 
Dr. Ed Morbius, Chief Scientist /            |
  Robot Wrangler / Staff Psychologist        | When you seek unlimited power
Krell Power Systems Unlimited                |                  Go to Krell!