Hello:
I have set up a CentOS 5 server for a client to push files onto using FTP.
I have a cron job to process the files and move them to another directory.
Sometimes the cron job runs while the client is still uploading a file (some of them can be large), and I get a partial file.
Is there a way to tell when a file has finished uploading?
I am using the vsftpd daemon installed using yum.
Thanks, Neil
-- Neil Aggarwal, (832)245-7314, www.JAMMConsulting.com Eliminate junk email and reclaim your inbox. Visit http://www.spammilter.com for details.
Is there a way to tell when a file has finished uploading?
One thought that comes to mind is to upload a file containing the md5sums of the other files.
Your script would then read the check file and compare the md5sum of each file with the one recorded in the check file. Good files are copied, bad files are left until the next time.
Shawn
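For illustration, a minimal sketch of the check-file idea. It assumes the client uploads a file named MD5SUMS (a hypothetical name, in the format written by `md5sum`) next to the data files; the directory arguments and the function name `move_verified` are placeholders as well.

```shell
# Move every file whose MD5 matches its entry in the check file.
# Usage: move_verified INCOMING_DIR DEST_DIR
# Assumes INCOMING_DIR/MD5SUMS holds lines of the form "<md5>  <filename>"
# (MD5SUMS is a hypothetical name that would be agreed with the client).
move_verified() {
    incoming=$1
    dest=$2
    [ -f "$incoming/MD5SUMS" ] || return 0   # check file not uploaded yet

    while read -r sum name; do
        name=${name#\*}                      # md5sum marks binary mode with '*'
        file="$incoming/$name"
        [ -f "$file" ] || continue           # listed but not uploaded yet
        actual=$(md5sum < "$file" | cut -d' ' -f1)
        if [ "$actual" = "$sum" ]; then
            mv "$file" "$dest/"              # good file: hand it on
        fi                                   # bad file: left for the next run
    done < "$incoming/MD5SUMS"
}
```

A cron job could call something like `move_verified /path/to/incoming /path/to/processing`; a file that is still uploading fails the comparison and is simply retried on the next run.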
Shawn:
Thanks for the tip, but I don't think I can ask the client to provide md5s of the input files. That would be too much to ask.
Any other ideas?
Neil
-----Original Message----- From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Shawn Everett Sent: Sunday, November 11, 2007 2:04 PM To: CentOS mailing list Subject: Re: [CentOS] How to know when files have finished FTPing?
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
On Sunday 11 November 2007, Neil Aggarwal wrote:
Any other ideas?
I assumed the uploads were automated. It would have been simple in that case. Even using Windows. :)
Are the file names unique? You could test the age of each file to make sure it hasn't been modified in over a minute or something...
Shawn
If you use proftpd with its HiddenStores directive turned on, it will upload the file under .in.filename and, once the transfer has completed, rename the file to filename.
On Nov 11, 2007 3:34 PM, Barry Brimer lists@brimer.org wrote:
If you use proftpd it will upload the file under .in.filename and once the transfer has completed, the file is renamed to filename
And of course you can ask the client to do this manually if the FTP server doesn't support doing it automatically. Upload to one file name, then rename to another, and only process that second name with the cron job.
You can even have the cron job end by renaming to a third name, so that there's a way to verify that processing was performed correctly.
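A minimal sketch of the three-name scheme, assuming a hypothetical convention in which the client uploads under any other name and renames the finished file to `*.ready`, and the cron job renames it to `*.done` after processing. The suffixes and `process_file` are illustrative placeholders.

```shell
# Placeholder for the real processing step (here it only reads the file).
process_file() {
    wc -c < "$1" > /dev/null
}

# Process only files the client has renamed to *.ready, then mark them *.done.
# Usage: process_ready DIR
process_ready() {
    dir=$1
    for f in "$dir"/*.ready; do
        [ -f "$f" ] || continue              # glob matched nothing
        if process_file "$f"; then
            mv "$f" "${f%.ready}.done"       # third name records success
        fi
    done
}
```

Files still being uploaded never carry the `.ready` suffix, so the job ignores them, and anything left as `.ready` after a run points to a processing failure.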
Neil Aggarwal wrote:
Any other ideas?
If you don't need to process the files as soon as possible, you can use a "two passes" process:
The idea is to copy the uploaded file(s), compute a checksum, wait for some period (longer than your "idle timeout"), then recompute the checksum:
- If the sum is the same, the file is complete (or the client timed out, but this is an exceptional case and you can't do much about it anyway). Remove the uploaded one and process the copy.
- If the sum is different (the file changed), remove the copy (and its sum) and wait for the next "pass" (if it changed lately, chances are it is still open, so don't do anything with it in this pass).
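The two passes could be sketched roughly as follows, with one cron invocation per pass. The function name `two_pass`, the staging layout (`files/` and `sums/`), and the use of `md5sum` are assumptions for illustration. As a small variation on the description above, a changed file has its stale copy overwritten with a fresh one instead of deleted.

```shell
# One pass of the two-pass check. Run it from cron at an interval longer
# than the FTP idle timeout.
# Usage: two_pass INCOMING_DIR STAGING_DIR
# Prints the path of each staged copy that is now considered complete.
two_pass() {
    incoming=$1
    staging=$2
    mkdir -p "$staging/files" "$staging/sums"
    for f in "$incoming"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        copy="$staging/files/$name"
        sumfile="$staging/sums/$name"
        current=$(md5sum < "$f" | cut -d' ' -f1)

        if [ -f "$sumfile" ] && [ "$(cat "$sumfile")" = "$current" ]; then
            # Unchanged since the previous pass: treat the upload as complete.
            rm -f "$f" "$sumfile"
            echo "$copy"                     # caller processes the copy
        else
            # New or still changing: take a fresh copy and record its sum.
            cp -p "$f" "$copy"
            md5sum < "$copy" | cut -d' ' -f1 > "$sumfile"
        fi
    done
}
```

A file is only reported on the second consecutive pass in which its checksum matches the staged copy, so the cron interval is what provides the waiting period.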
An alternative would be to use lsof to see if the file is being held open by the ftpd process, but I never tried this. If this works, I'd like to hear about it.
In either case, if the client is disconnected during an upload, there is not much to do on the server unless you have a way to verify the files. But in that case, you don't need the above approaches.
If you have some control on the client side, you can recommend uploading a "sentinel" file at the end of the transfer (say zzzzzzzz.last if the upload is performed in alphabetical order). Then at each pass, you check for the presence of this file, remove it, and do your processing.
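A minimal sketch of the sentinel check, using the zzzzzzzz.last name from the suggestion above. The function name, the directory arguments, and the `mv` that stands in for the real processing are assumptions.

```shell
# Process a batch only once the sentinel file has arrived.
# Usage: process_batch INCOMING_DIR DEST_DIR
# Returns non-zero when the sentinel is absent (batch not finished yet).
process_batch() {
    incoming=$1
    dest=$2
    sentinel="$incoming/zzzzzzzz.last"
    [ -f "$sentinel" ] || return 1           # not finished; try the next pass
    rm -f "$sentinel"                        # consume the sentinel first
    for f in "$incoming"/*; do
        [ -f "$f" ] || continue
        mv "$f" "$dest/"                     # stand-in for the real processing
    done
}
```

This only helps if the client really does upload the sentinel last; a client that transfers files in parallel could still have other uploads in flight when it arrives.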
I used lsof and that is working perfectly.
I called lsof [filename] and checked if there was any output.
Thanks!
Neil
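The exact script was not posted, so the following is only a sketch of how such a check might look, assuming lsof is installed and the cron job runs with enough privileges (typically root) to see files held open by the FTP daemon. The function names and directory arguments are hypothetical.

```shell
# Succeeds (exit 0) if some process currently has FILE open, judged as
# described above: by whether "lsof FILE" prints anything.
is_open() {
    [ -n "$(lsof "$1" 2>/dev/null)" ]
}

# Move files from INCOMING_DIR to DEST_DIR, skipping any that are still open.
# Usage: move_closed INCOMING_DIR DEST_DIR
move_closed() {
    incoming=$1
    dest=$2
    for f in "$incoming"/*; do
        [ -f "$f" ] || continue
        is_open "$f" && continue             # still being written; next run
        mv "$f" "$dest/"
    done
}
```

As noted earlier in the thread, a file left behind by a dropped connection is no longer open, so this check alone cannot tell it apart from a finished upload.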
On Nov 12, 2007 1:01 AM, Neil Aggarwal neil@jammconsulting.com wrote:
I used lsof and that is working perfectly.
I called lsof [filename] and checked if there was any output.
Usually we use a lock file:

    if [ ! -f /tmp/uploading ]; then
        touch /tmp/uploading
        # ... do your stuff ...
        rm -f /tmp/uploading
    fi
On Sun, 2007-11-11 at 13:20 -0600, Neil Aggarwal wrote:
...CentOS 5...
I have a cron job...
Is there a way to tell when a file has finished uploading?
Neil Aggarwal wrote:
Is there a way to tell when a file has finished uploading?
How about just watching the log file rather than the file directory? The log entry for a file is only written once the transfer has completed, since it includes the file size and the status of the transfer.
- KB
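A rough sketch of reading the log, assuming vsftpd is configured to write the standard xferlog format (`xferlog_enable=YES` and `xferlog_std_format=YES` in vsftpd.conf) and that file names contain no whitespace. The function name is hypothetical, and `/var/log/xferlog` is assumed as the log location.

```shell
# Print the paths of uploads that the transfer log records as complete.
# Usage: completed_uploads [LOGFILE]
completed_uploads() {
    log=${1:-/var/log/xferlog}
    # In xferlog format the timestamp takes fields 1-5; field 9 is the file
    # name, field 12 the direction ("i" = incoming) and field 18 the
    # completion status ("c" = complete, "i" = incomplete).
    awk '$12 == "i" && $18 == "c" { print $9 }' "$log"
}
```

The cron job would then need to remember how far into the log it has already read, for example by recording the names it has processed.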
On Nov 11, 2007 2:20 PM, Neil Aggarwal neil@jammconsulting.com wrote:
Is there a way to tell when a file has finished uploading?
Wow, all quite involved answers.
You may be able to get the list of files to process using 'find'. If you check using "-mmin +X", you might be able to get only files that haven't been updated in X minutes. That should filter out anything in progress. If a transfer is stopped in the middle, then restarted later, that wouldn't help here.
It's possible that the FTP daemon might not update this information, so you'll have to experiment. It's worth a shot before you start telling the client to modify their process.
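A minimal sketch of the `find` approach, assuming GNU find (as shipped with CentOS). The function name and its arguments are placeholders.

```shell
# List regular files directly inside DIR that were last modified more than
# MINUTES minutes ago.
# Usage: quiet_files DIR MINUTES
quiet_files() {
    find "$1" -maxdepth 1 -type f -mmin +"$2"
}
```

For example, `quiet_files /path/to/incoming 5` would list only files untouched for more than five minutes, which the cron job could then read line by line and move.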