On Mon, 25 Jul 2011, Marc Deop wrote:

> It's more than twice as fast as the previous sh script.

In part this is /bin/sh vs /bin/bash, and using 'bashisms' matters;
but yes, I did not seek to optimize a teaching throwaway.

> 1- md5sum the file we need

... actually the NAME of the file, to make it explicit that we are not
looking at content [also a reasonable approach if one is looking to
find and de-duplicate a filestore]

> 2- look for the first letter of the hash

... actually this may be more than a single letter of the hash --
with ca 3000 files, and 16 possible values for a hash character, we
should end up with about 200 files per subdirectory. The filesystem
should be doing some sort of index as well -- as I recall, a B-tree in
the case of extN, but I've not expressly looked. The PHP case was
mentioned, however, and its directory searching is less optimal.

We have a customer with a similar problem in a naively written set
of home-brewed PHP code, and are helping them work through similar
issues.

> 3- get into the directory
> 4- now we look for our file

... this is probably a single operation to suck the sub-directory
listing into an array in PHP, and use an associative match.

But you are right, we are moving increasingly away from a CentOS
issue to a more general coding style issue.

-- Russ herrold
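
For illustration only, here is a minimal sketch of the scheme discussed above: hash the file NAME, use a leading slice of the hex digest as the subdirectory, then look for the file inside it. It is written in Python rather than the sh or PHP of the thread, and the function names (`shard_path`, `store`, `lookup`) and the `prefix_len` parameter are hypothetical, not anything from the original scripts. The lookup here simply tests the computed path instead of reading the subdirectory listing into an array.

```python
import hashlib
import os
import tempfile


def shard_path(root, filename, prefix_len=1):
    """Build root/<leading hex chars of md5(NAME)>/filename.

    The hash is taken over the file NAME, not its content.
    prefix_len=1 gives 16 subdirectories; 2 gives 256.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[:prefix_len], filename)


def store(root, filename, data, prefix_len=1):
    """Write data under the sharded path, creating the subdirectory if needed."""
    path = shard_path(root, filename, prefix_len)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


def lookup(root, filename, prefix_len=1):
    """Return the sharded path if the file exists there, else None."""
    path = shard_path(root, filename, prefix_len)
    return path if os.path.isfile(path) else None


if __name__ == "__main__":
    # Hypothetical usage: a throwaway directory stands in for the filestore.
    with tempfile.TemporaryDirectory() as root:
        print(store(root, "report.txt", b"example"))
        print(lookup(root, "report.txt"))
        print(lookup(root, "missing.txt"))
```

With about 3000 files and a one-character prefix, an evenly spread hash gives 3000 / 16, or roughly 190 files per subdirectory; a two-character prefix would bring that down to about a dozen.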