I have searched quite a lot on this and found many other questions on the topic. Their answers boil down to this: use a FileSystemWatcher and, in its Changed event, open the file for reading with FileShare.None and check whether that throws an exception. However, this only works half the time for me.

I created a simple console application to test this behaviour which essentially boils down to the following:

FileSystemWatcher fsw = new FileSystemWatcher("d:\\locktest");
fsw.Changed += (sender, e) => {
    String fp = e.FullPath;
    Console.WriteLine(fp + " changed");
    try {
        using(FileStream s = new FileStream(fp, FileMode.Open, FileAccess.Read, FileShare.None)) { }
        Console.WriteLine(fp + " complete");
    } catch (IOException ex) {
        Console.WriteLine(fp + " not complete");
    }
};
fsw.Created += (sender, e) => Console.WriteLine(e.FullPath + " created");
fsw.EnableRaisingEvents = true;
Console.ReadKey(); // would immediately exit otherwise

I now tested this with different files and directories, and here are my results:

Small file (1MiB)

Copied via Windows explorer

d:\locktest\1mb created
d:\locktest\1mb changed
d:\locktest\1mb complete
d:\locktest\1mb changed
d:\locktest\1mb complete

Result: Correct, but twice.

Copied via DOS copy

Same as via Windows explorer.

Copied via Cygwin cp

d:\locktest\1mb created
d:\locktest\1mb changed
d:\locktest\1mb complete

Correct result.

1GiB large file

Copied via Windows explorer

d:\locktest\1g created
d:\locktest\1g changed
d:\locktest\1g not complete
d:\locktest\1g changed
d:\locktest\1g complete

Correct result.

Copied via DOS copy

Same result

Copied via Cygwin cp

d:\locktest\1g created
d:\locktest\1g changed
d:\locktest\1g complete

Also a correct result

Large file (10GiB)

Copied via Windows explorer

d:\locktest\10g created
d:\locktest\10g changed
d:\locktest\10g not complete
d:\locktest\10g changed
d:\locktest\10g not complete

The last "changed" event is triggered "too early", or rather: no "changed" event is triggered when the file is actually complete.

Copied via DOS copy

Same result

Copied via Cygwin cp

d:\locktest\10g created
d:\locktest\10g changed
d:\locktest\10g complete

Correct result.

So, how can I reliably detect whether a file written to a watched directory is really "complete", i.e., "usable"?

My current approach would be the following:

  1. Listen for Created and Changed
  2. Try to open the file there
  3. If opening the file failed, put this file as a "candidate" into a (synchronized) list of files to be checked again
  4. Check this list every X seconds
  5. If Changed occurs for one of the files in the list, remove it from the list and check it immediately, re-adding it if the check fails again
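
The steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: PendingFileTracker, Check, CanOpenExclusively and the onComplete callback are hypothetical names made up for illustration, the retry interval stands in for "X seconds", and a file that no longer exists is simply dropped from the list.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical helper (not part of FileSystemWatcher): remembers files that
// could not yet be opened exclusively and retries them on a timer.
class PendingFileTracker : IDisposable
{
    private readonly object gate = new object();
    private readonly HashSet<string> pending =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly Action<string> onComplete;
    private readonly Timer timer;

    public PendingFileTracker(TimeSpan retryInterval, Action<string> onComplete)
    {
        this.onComplete = onComplete;
        // Step 4: re-check all candidates every retryInterval.
        timer = new Timer(_ => RetryAll(), null, retryInterval, retryInterval);
    }

    // Steps 1-3 and 5: call this from both the Created and the Changed handler.
    public void Check(string path)
    {
        lock (gate) // serializes event handlers and timer callbacks
        {
            if (!File.Exists(path))
            {
                pending.Remove(path); // deleted, renamed, or a directory
            }
            else if (CanOpenExclusively(path))
            {
                pending.Remove(path);
                onComplete(path);
            }
            else
            {
                pending.Add(path); // still being written: keep as candidate
            }
        }
    }

    private void RetryAll()
    {
        string[] snapshot;
        lock (gate)
        {
            snapshot = new string[pending.Count];
            pending.CopyTo(snapshot);
        }
        foreach (string path in snapshot)
            Check(path);
    }

    // Same test as in the Changed handler above: open with FileShare.None.
    public static bool CanOpenExclusively(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None)) { }
            return true;
        }
        catch (IOException)
        {
            return false;
        }
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

It would be wired up with something like `fsw.Created += (s, e) => tracker.Check(e.FullPath);` and the same for `fsw.Changed`. Note that onComplete runs while the lock is held, so it should be quick, and that this sketch does nothing about a file being reported complete more than once.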

This involves some sort of polling, and looks a bit "hackish" to me. Plus, for very small files, I get the completed information twice, so I figure I have to track already-complete files and check if they have changed since.
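
Tracking already-complete files could look something like the sketch below. It assumes that "changed since" can be judged by length plus last-write time, which is my assumption and may not hold for every writer; CompletedFileLog and TryMarkComplete are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: remembers the length and last-write time with which
// each file was last reported as complete.
class CompletedFileLog
{
    private readonly object gate = new object();
    private readonly Dictionary<string, Tuple<long, DateTime>> seen =
        new Dictionary<string, Tuple<long, DateTime>>(StringComparer.OrdinalIgnoreCase);

    // Returns true only if the file exists and has not already been reported
    // with exactly this length and last-write time.
    public bool TryMarkComplete(string path)
    {
        var info = new FileInfo(path);
        if (!info.Exists)
            return false;

        var stamp = Tuple.Create(info.Length, info.LastWriteTimeUtc);
        lock (gate)
        {
            Tuple<long, DateTime> previous;
            if (seen.TryGetValue(path, out previous) && previous.Equals(stamp))
                return false; // duplicate notification for an unchanged file
            seen[path] = stamp;
            return true;
        }
    }
}
```

The "complete" handler would then only act when TryMarkComplete returns true, which should suppress the second notification seen for the 1MiB file as long as the file is not touched in between.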

Is this the only way to go or is there a better approach?

rabejens
  • Are all files created by your own code? – Matías Fidemraizer Oct 27 '15 at 11:09
  • Can't you just loop until the file opens? This seems really overkill. – Luke Joshua Park Oct 27 '15 at 11:10
  • No, they are created by third-party applications running on other computers. They write the files to a drive on the machine my application runs on, and that drive is exported to them as a Windows share. – rabejens Oct 27 '15 at 11:11
  • Unfortunately you cannot guarantee a nice series of distinct events for file operations with the FSW because some applications - like the shell in your example - will perform multiple operations whilst writing a file, each of which is reported. – Alex K. Oct 27 '15 at 11:14
  • Possible duplicate of [FileSystemWatcher vs polling to watch for file changes](http://stackoverflow.com/questions/239988/filesystemwatcher-vs-polling-to-watch-for-file-changes) – Ondrej Svejdar Oct 27 '15 at 11:18
  • It's not completely a duplicate, but the linked post has some useful information. Thanks for that. Since I at least know the files that get written and how to programmatically check if they are complete yet, I figure I should go with polling. – rabejens Oct 27 '15 at 11:22
  • If the files are produced on a schedule, check for the files x minutes after the schedule is supposed to be done. Also, you could check for file size every x seconds, and when you have 3 or 4 consistent sizes, consider it done. – Jeremy Oct 27 '15 at 11:25
  • That is what I thought about when I mentioned polling. – rabejens Oct 27 '15 at 11:25
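
The size check suggested in the comments (poll the file size every x seconds and treat the file as done after a few identical readings) might be sketched like this. FileStability and WaitForStableSize are hypothetical names, and the interval and counts are placeholders. One caveat to keep in mind: a writer that allocates the final size up front would look "stable" before it has finished, so this is probably best combined with the exclusive-open check.

```csharp
using System;
using System.IO;
using System.Threading;

static class FileStability
{
    // Polls the file length every `interval`. Returns true once the same
    // length has been read `requiredStableChecks` times in a row; returns
    // false if the file disappears or `maxChecks` polls pass first.
    public static bool WaitForStableSize(
        string path, TimeSpan interval, int requiredStableChecks, int maxChecks)
    {
        long lastLength = -1;
        int stableCount = 0;

        for (int i = 0; i < maxChecks; i++)
        {
            // A fresh FileInfo per poll avoids reading a cached length.
            var info = new FileInfo(path);
            if (!info.Exists)
                return false;

            long length = info.Length;
            stableCount = (length == lastLength) ? stableCount + 1 : 1;
            lastLength = length;

            if (stableCount >= requiredStableChecks)
                return true;

            Thread.Sleep(interval);
        }
        return false;
    }
}
```

For example, `FileStability.WaitForStableSize(path, TimeSpan.FromSeconds(5), 3, 120)` would wait for three identical readings five seconds apart and give up after 120 polls. It blocks the calling thread, so it should not be called directly inside a FileSystemWatcher event handler.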
