Neil Aggarwal wrote:
Shawn:
Thanks for the tip, but I don't think I can ask the client to provide md5s of the input files. That would be too much to ask.
Any other ideas?
If you don't need to process files as soon as they arrive, you can use a "two-pass" process:
The idea is to copy the uploaded file(s), compute a checksum of each copy, wait for some period (longer than your "idle timeout"), then recompute the checksum of the uploaded file:
- If the sum is the same, the file is complete (or the client timed out, but that is an exceptional case and you can't do much about it anyway). Remove the uploaded file and process the copy.
- If the sum is different (the file changed), remove the copy (and its sum) and wait for the next "pass". If the file changed recently, chances are it is still open, so don't do anything with it in this pass.
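The two-pass check above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the names two_pass_check and file_md5, the work_dir layout, and the choice of MD5 are mine, and the wait is done inline with time.sleep, whereas a real setup would more likely run each pass from cron.

```python
import hashlib
import os
import shutil
import time


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def two_pass_check(upload_path, work_dir, wait_seconds):
    """Hypothetical helper: return the path of a stable copy, or None.

    wait_seconds is assumed to be longer than the FTP server's idle timeout.
    """
    copy_path = os.path.join(work_dir, os.path.basename(upload_path))
    shutil.copy2(upload_path, copy_path)   # first pass: copy and sum
    first_sum = file_md5(copy_path)

    time.sleep(wait_seconds)               # give a live upload time to grow

    if file_md5(upload_path) == first_sum:
        os.remove(upload_path)             # unchanged: treat as complete
        return copy_path
    os.remove(copy_path)                   # changed: retry on the next pass
    return None
```

A caller would process the returned copy when the result is not None, and simply leave the upload alone otherwise.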
An alternative would be to use lsof to see whether the file is still held open by the ftpd process, but I have never tried this. If it works, I'd like to hear about it.
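For what it's worth, an untested sketch of the lsof idea in Python might look like this. It assumes lsof is installed and that the script runs with enough privileges to see the ftpd process's open files (as a non-root user, lsof may not report them). The function name is_open and the injectable run parameter are made up for illustration. It checks for any process holding the file, not specifically ftpd.

```python
import subprocess


def is_open(path, run=subprocess.run):
    """Return True if lsof lists at least one process holding `path` open.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without lsof being present.
    """
    result = run(
        ["lsof", "-t", "--", path],   # -t: terse output, one PID per line
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    # No PIDs printed means lsof found no process with the file open.
    return bool(result.stdout.strip())
```

A pass would then skip any file for which is_open(path) returns True and pick it up on a later pass.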
In either case, if the client is disconnected during an upload, there is not much you can do on the server unless you have a way to verify the files. But if you have that, you don't need the approaches above.
If you have some control over the client side, you can recommend uploading a "sentinel" file at the end of the transfer (say zzzzzzzz.last, if the upload is performed in alphabetical order). Then at each pass you check for the presence of this file, remove it, and do your processing.
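The sentinel check could be sketched like this in Python. The function name batch_ready and the flat upload directory are assumptions for illustration. The sentinel name is just the example one from above.

```python
import os


def batch_ready(upload_dir, sentinel="zzzzzzzz.last"):
    """Hypothetical helper: consume the sentinel and list the batch.

    Returns None if the sentinel is absent (transfer not finished),
    otherwise removes it and returns the sorted names of the other
    regular files in upload_dir.
    """
    marker = os.path.join(upload_dir, sentinel)
    if not os.path.exists(marker):
        return None
    os.remove(marker)   # remove it so the next pass waits for a new batch
    return sorted(
        name
        for name in os.listdir(upload_dir)
        if os.path.isfile(os.path.join(upload_dir, name))
    )
```

Each pass would call batch_ready on the upload directory and process the returned files only when the result is not None.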