21

My application watches a set of folders where users can upload files. When a file upload finishes I have to process the file, but I don't know how to detect that a file has not finished uploading.

Is there any way to detect that a file has not yet been released by the FTP server?

Bengi Besçeli
user2223898

4 Answers

27

There's no generic solution to this problem.

Some FTP servers lock the file being uploaded, preventing you from accessing it while the upload is still in progress. For example, the IIS FTP server does that. Most other FTP servers do not. See my answer at Prevent file from being accessed as it's being uploaded.


There are some common workarounds to the problem (originally posted in SFTP file lock mechanism, but relevant for FTP too):

  • You can have the client upload a "done" file once the upload finishes. Make your automated system wait for the "done" file to appear.

  • You can have a dedicated "upload" folder and have the client (atomically) move the uploaded file to a "done" folder. Make your automated system look to the "done" folder only.

  • Have a file naming convention for files being uploaded (".filepart") and have the client (atomically) rename the file after upload to its final name. Make your automated system ignore the ".filepart" files.

    See (my) article Locking files while uploading / Upload to temporary file name for an example of implementing this approach.

    Also, some FTP servers have this functionality built-in. For example ProFTPD with its HiddenStores directive.

  • A gross hack is to periodically check the file attributes (size and time) and consider the upload finished if the attributes have not changed for some time interval.

  • You can also make use of the fact that some file formats have a clear end-of-file marker (like XML or ZIP), so you can tell when a file is still incomplete.
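
As an illustration of the last workaround, here is a minimal Python sketch (not part of the original answer). It treats a ZIP or XML file as complete only when the standard-library parser accepts it: a truncated ZIP lacks the end-of-central-directory record at its end, and a truncated XML document is not well-formed. The function names `zip_looks_complete` and `xml_looks_complete` are made up for this example.

```python
import xml.etree.ElementTree as ET
import zipfile


def zip_looks_complete(path):
    """Return True if the ZIP at `path` opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        # A partially uploaded archive has no central directory yet.
        return False


def xml_looks_complete(path):
    """Return True if the XML file at `path` parses as a well-formed document."""
    try:
        ET.parse(path)
        return True
    except (ET.ParseError, OSError):
        # Unclosed elements (or an empty file) make the parser fail.
        return False
```

Both checks read the whole file, so for large uploads it is probably worth running them only after a cheaper test (such as an unchanged file size) has passed.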


Some FTP servers allow you to configure a hook to be called when an upload is finished. You can make use of that. For example, ProFTPD has a mod_exec module (see the ExecOnCommand directive).

Martin Prikryl
6

I use ftputil to implement this work-around:

  1. Connect to the FTP server.
  2. List all files in the directory.
  3. Call stat() on each file.
  4. Wait N seconds.
  5. For each file, call stat() again. If the result is different, skip this file, since it was modified during the last N seconds.
  6. If the stat() result has not changed, download the file.
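
The steps above can be sketched roughly as follows. This is not the code I actually run, only a minimal illustration: `find_stable_files` and its parameters are names invented for this example, and the listing and stat functions are passed in as callables so the same logic works for a local folder or an FTP connection.

```python
import time


def find_stable_files(list_names, stat_file, wait_seconds=60, sleep=time.sleep):
    """Return the names whose size and mtime are unchanged after `wait_seconds`.

    list_names: callable returning an iterable of file names.
    stat_file:  callable taking a name and returning an object with
                st_size and st_mtime attributes (like os.stat results).
    """
    def snapshot():
        result = {}
        for name in list_names():
            info = stat_file(name)
            result[name] = (info.st_size, info.st_mtime)
        return result

    before = snapshot()
    sleep(wait_seconds)
    after = snapshot()
    # Files that appeared or changed during the wait are skipped this round.
    return sorted(name for name, attrs in after.items() if before.get(name) == attrs)


# Possible use with ftputil (host, credentials and directory are placeholders):
# with ftputil.FTPHost("ftp.example.com", "user", "password") as host:
#     ready = find_stable_files(
#         lambda: host.listdir("/incoming"),
#         lambda name: host.stat(host.path.join("/incoming", name)),
#     )
```

The sketch does not handle a file that disappears between the listing and the stat() call; a real poller would probably want to catch that error and skip the file.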

This whole ftp-fetching is old and obsolete technology. I hope that the customer will use a modern http API the next time :-)

guettli
  • It's easy to blame this on an old protocol but that problem is not very different with HTTP. – tcurdt Sep 11 '18 at 20:23
  • @tcurdt if the customer does use http to send us files, then I can handle the receiving part. I can act immediately, I can validate it, I can reject the file if it is not valid ... At least in my context (I use Django). – guettli Sep 12 '18 at 11:00
  • you are talking about writing a http service. If you write your own ftp service you can do the same. That's a non-argument. Try a put on a webdav server. FWIW proftpd even has a config option to turn on two-stage uploads. This is a question of implementation - not protocol. – tcurdt Sep 12 '18 at 14:22
  • @tcurdt yes, you are right. Can you recommend an easy-to-use framework for implementing an FTP service that allows me to validate the data before accepting it? – guettli Sep 12 '18 at 14:35
  • @guettli thanks for the answer. How much time do you usually wait before calling the stat() again? I tried with 10 seconds, but apparently that's not long enough to refresh the stats – haralambov Oct 25 '18 at 19:05
  • @Brood 60 seconds. – guettli Oct 26 '18 at 08:07
2

If you are reading files with particular extensions, use WinSCP for the file transfer. It creates a temporary file with the .filepart extension and renames it to the actual file name once the file has been fully transferred.
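
On the receiving side, this might look like the following minimal Python sketch. It assumes the uploader really does use the .filepart convention; the function name `completed_uploads` and the default extensions are placeholders chosen for the example.

```python
from pathlib import Path


def completed_uploads(folder, wanted_suffixes=(".csv", ".xml")):
    """Return names of files in `folder` whose final extension is a wanted one.

    A file still being uploaded is named e.g. "report.csv.filepart", so its
    final extension is ".filepart" and it is skipped automatically.
    """
    ready = []
    for entry in Path(folder).iterdir():
        if entry.is_file() and entry.suffix.lower() in wanted_suffixes:
            ready.append(entry.name)
    return sorted(ready)
```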

I hope it helps someone.

Rama Krishna
0

This is a classic problem with FTP transfers. The only mostly reliable method I've found is to send a file, then send a second short "marker" file just to tell the recipient the transfer of the first is complete. You can use a file naming convention and just check for existence of the second file.

You might get fancy and make the content of the second file a checksum of the first file. Then you could verify the first file. (You don't have the problem with the second file because you just wait until file size = checksum size).

And of course this only works if you can get the sender to send a second file.
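
A minimal Python sketch of the receiving side, assuming the sender uploads the data file first and then a marker named after it with a `.sha256` suffix containing the hex digest. The suffix, the choice of SHA-256 and the function name `upload_is_verified` are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def upload_is_verified(data_path, marker_suffix=".sha256"):
    """Return True once the marker file exists, is complete, and matches the data."""
    data_path = Path(data_path)
    marker = data_path.with_name(data_path.name + marker_suffix)
    if not (data_path.is_file() and marker.is_file()):
        return False  # the marker (or the data file) has not arrived yet
    expected = marker.read_text().strip().lower()
    if len(expected) != 64:
        return False  # the marker itself is still being uploaded
    # Reads the whole file into memory; fine for a sketch, not for huge files.
    actual = hashlib.sha256(data_path.read_bytes()).hexdigest()
    return actual == expected
```

For example, `upload_is_verified("incoming/report.csv")` would look for `incoming/report.csv.sha256` and return True only when the digests match.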

A. L. Flanagan
  • In my case, I can't influence the ftp uploading application and I can't influence the ftp server :-( ... but it is solvable (see above) – guettli Apr 28 '17 at 13:27
  • And a clever solution it is -- but I don't know if I really trust it (but then, I'm paranoid). You're right, of course, FTP sux. – A. L. Flanagan Apr 28 '17 at 17:15