18

I'm implementing a class in C# that monitors a directory, processes files as they are dropped, and then deletes (or moves) each file as soon as processing is complete. Since multiple threads can run this code, the first one that picks up a file locks it exclusively, so that no other thread reads the same file and no external process or user can access it in any way. I would like to keep the lock until the file is deleted or moved, so there's no risk of another thread, process, or user accessing it.

So far I have tried two implementations, but neither works the way I want.

Option 1

FileStream fs = file.Open(FileMode.Open, FileAccess.Read, FileShare.Delete);
//Read and process
File.Delete(file.FullName); //Or File.Move, based on a flag
fs.Close();

Option 2

FileStream fs = file.Open(FileMode.Open, FileAccess.Read, FileShare.None);
//Read and process
fs.Close();
File.Delete(file.FullName); //Or File.Move, based on a flag

The issue with Option 1 is that other processes can access the file (they can delete, move, rename) while it should be fully locked.

The issue with Option 2 is that the file is unlocked before it is deleted, so another process or thread can lock it in that window, and the delete then fails.

I was looking for an API that can perform the delete using the file handle to which I already have exclusive access.

Edit

The directory being monitored resides in a pub share, so other users and processes have access to it. The issue is not managing the locks within my own process. The issue I'm trying to solve is how to lock a file exclusively and then move or delete it without releasing the lock.

GDemartini
  • Can you rename it to something that the other threads won't pick up, before processing it? – Moose Mar 26 '13 at 03:38
  • File names can be anything, so renaming wouldn't help. Even if the file name was important, I want to prevent any other process or user that has access to the folder from touching the file while it's locked. – GDemartini Mar 27 '13 at 17:53

6 Answers

8

Two solutions come to mind.

The first and simplest is to have the thread rename the file to something that the other threads won't touch. Something like "filename.dat.<unique number>", where <unique number> is something thread-specific. Then the thread can party on the file all it wants.

If two threads get the file at the same time, only one of them will be able to rename it. You'll have to handle the IOException that occurs in the other threads, but that shouldn't be a problem.
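A minimal sketch of the rename approach, assuming a hypothetical helper called `FileClaimer.TryClaim` and a GUID as the unique suffix (the answer only asks for something unique per thread). It shows how the losing threads can treat the `IOException` as "someone else has it" and move on:

```csharp
using System;
using System.IO;

public static class FileClaimer
{
    // Tries to claim 'path' by renaming it with a unique suffix.
    // Returns true and the new path on success; returns false if another
    // thread or process renamed, removed, or locked the file first.
    public static bool TryClaim(string path, out string claimedPath)
    {
        // The ".<guid>" suffix is an illustrative choice, not part of the answer.
        string candidate = path + "." + Guid.NewGuid().ToString("N");
        try
        {
            File.Move(path, candidate);
            claimedPath = candidate;
            return true;
        }
        catch (IOException)
        {
            // Covers FileNotFoundException (already renamed or deleted)
            // and sharing violations (file currently locked).
            claimedPath = string.Empty;
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            claimedPath = string.Empty;
            return false;
        }
    }
}
```

A worker would call `FileClaimer.TryClaim(path, out var claimed)` and only process `claimed` when the call returns true.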

The other way is to have a single thread monitoring the directory and placing file names into a BlockingCollection. Worker threads take items from that queue and process them. Because only one thread can get that particular item from the queue, there is no contention.

The BlockingCollection solution is a little bit (but only a little bit) more complicated to set up, but should perform better than a solution that has multiple threads monitoring the same directory.
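Here is a rough sketch of the BlockingCollection idea. The names `DirectoryQueue` and `ProcessAll` are made up for illustration, and the producer does a single pass over the directory to keep the example short. A real monitor would presumably loop or use a `FileSystemWatcher` instead:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public static class DirectoryQueue
{
    // Enqueues every file currently in 'directory', then lets 'workerCount'
    // workers drain the queue. Each path is handed to exactly one worker.
    public static void ProcessAll(string directory, int workerCount, Action<string> process)
    {
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new Task[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = Task.Run(() =>
                {
                    // Blocks until an item arrives; the loop ends once
                    // CompleteAdding has been called and the queue is empty.
                    foreach (string path in queue.GetConsumingEnumerable())
                    {
                        process(path);
                    }
                });
            }

            // Single producer: the only code that looks at the directory.
            foreach (string path in Directory.EnumerateFiles(directory))
            {
                queue.Add(path);
            }
            queue.CompleteAdding();

            Task.WaitAll(workers);
        }
    }
}
```

Because the queue hands each path to one consumer only, the workers need no locking among themselves. This does nothing about other processes touching the files.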

Edit

Your edited question changes the problem quite a bit. If you have a file in a publicly accessible directory, it's at risk of being viewed, modified, or deleted at any point between the time it's placed there and the time your thread locks it.

Since you can't move or delete a file while you have it open (not that I'm aware of), your best bet is to have the thread move the file to a directory that's not publicly accessible. Ideally to a directory that's locked down so that only the user under which your application runs has access. So your code becomes:

File.Move(sourceFilename, destFilename);
// the file is now in a presumably safe place.
// Assuming that all of your threads obey the rules,
// you have exclusive access by agreement.

Edit #2

Another possibility would be to open the file exclusively and copy it using your own copy loop, leaving the file open when the copy is done. Then you can rewind the file and do your processing. Something like:

var srcFile = File.Open(/* be sure to specify exclusive access */);
var destFile = File.OpenWrite(/* destination path */);
// copy the file
var buffer = new byte[32768];
int bytesRead = 0;
while ((bytesRead = srcFile.Read(buffer, 0, buffer.Length)) != 0)
{
    destFile.Write(buffer, 0, bytesRead);
}
// close destination
destFile.Close();
// rewind source
srcFile.Seek(0, SeekOrigin.Begin);
// now read from source to do your processing.
// for example, to get a StreamReader, just pass the srcFile stream to the constructor.

You can process and then copy, sometimes. It depends on if the stream stays open when you're finished processing. Typically, code does something like:

using (var strm = new StreamReader(srcStream, ...))
{
    // do stuff here
}

That ends up closing both the reader and the underlying srcStream. You'd have to write your code like this:

using (var srcStream = new FileStream( /* exclusive access */))
{
    var reader = new StreamReader(srcStream, ...);
    // process the stream, leaving the reader open
    // rewind srcStream
    // copy srcStream to destination
    // close reader
}

Doable, but clumsy.
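For what it's worth, `StreamReader` has a constructor overload with a `leaveOpen` parameter (available since .NET 4.5), which makes the process-then-copy order a bit less clumsy. A sketch, assuming UTF-8 text input and using line counting as a stand-in for real processing. `ProcessThenCopy.Run` is a made-up name:

```csharp
using System;
using System.IO;
using System.Text;

public static class ProcessThenCopy
{
    // Opens 'sourcePath' exclusively, reads it as text, then copies the same
    // still-open stream to 'destPath'. Returns the number of lines read.
    public static int Run(string sourcePath, string destPath)
    {
        int lines = 0;
        using (var srcStream = new FileStream(sourcePath, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            // leaveOpen: true keeps srcStream open when the reader is disposed.
            using (var reader = new StreamReader(srcStream, Encoding.UTF8, true, 4096, leaveOpen: true))
            {
                while (reader.ReadLine() != null)
                {
                    lines++; // stand-in for real processing
                }
            }

            // Rewind and copy while the exclusive handle is still held.
            srcStream.Seek(0, SeekOrigin.Begin);
            using (var destStream = File.Create(destPath))
            {
                srcStream.CopyTo(destStream);
            }
        }
        return lines;
    }
}
```

The source file is still there when `Run` returns, so deleting it afterwards has the same small window as before.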

Oh, and if you want to eliminate the potential of somebody reading the file before you can delete it, just truncate the file to zero length before you close it (the stream has to be opened with write access for this to work). As in:

srcStream.Seek(0, SeekOrigin.Begin);
srcStream.SetLength(0);

That way if somebody does get to it before you get around to deleting it, there's nothing to modify, etc.

Rowland Shaw
Jim Mischel
  • Neither of these solutions would prevent external users/processes from renaming or deleting a locked file (locked with `FileShare.Delete`). Managing the locks within my own process would be trivial using `BlockingCollection` or some other synchronization mechanism. – GDemartini Apr 01 '13 at 17:38
  • Having users view/modify/delete the file before it's picked up is fine (the files are actually manually dropped, so if someone drops a file by mistake and is lucky enough to delete before it's locked, it's ok). The proposed solution of moving the file to a safe place would work, even though it comes with a performance hit (it would be copying from a network share to a local directory and then back to the same network share into an archive folder). Thanks, @jim-mischel, if I really don't find a way to move the file "re-using" the handle, I'll flag this as the accepted answer. – GDemartini Apr 01 '13 at 19:55
  • @GDemartini: Just curious why the "safe" directory can't be on the network share. Can't you set permissions to restrict access there? – Jim Mischel Apr 02 '13 at 03:21
  • I'm not the owner of the share, so I shouldn't create a folder or change its structure, even though I have permission on the file system to do so. – GDemartini Apr 02 '13 at 17:38
  • According to the [documentation for File.Move](https://msdn.microsoft.com/en-us/library/system.io.file.move(v=vs.110).aspx): "If you try to move a file across disk volumes and that file is in use, the file is copied to the destination, but it is not deleted from the source." – user963263 Jan 12 '17 at 20:10
7

Here is the most robust way I know of that will even work correctly if you have multiple processes on multiple servers working with these files.

Instead of locking the files themselves, create a temporary file for locking, this way you can unlock/move/delete the original file without problems, but still be sure that at least any copies of your code running on any server/thread/process will not try to work with the file at the same time.

Pseudocode:

try
{
    // get an exclusive cross-server/process/thread lock by opening/creating a temp file with no sharing allowed
    var lockFilePath = $"{file}.lck";
    var lockFile = File.Open(lockFilePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);

    try
    {
        // open file itself with no sharing allowed, in case some process that does not use our locking schema is trying to use it
        var fileHandle = File.Open(file, FileMode.Open, FileAccess.Read, FileShare.None);

        // TODO: add processing -- we have exclusive access to the file, and also the locking file

        fileHandle.Close();

        // at this point it is possible for some other process that does not use our locking schema to lock the file before we
        //  move it, causing us to process this file again -- we would always have to handle issues where we failed to move
        //  the file anyway (maybe we just lost power, or crashed?) so we had to design around this no matter what

        File.Move(file, archiveDestination);
    }
    finally
    {
        lockFile.Close();

        try
        {
            File.Delete(lockFilePath);
        }
        catch (Exception ex)
        {
            // another process opened the lock file after we closed it and before we deleted it -- safely ignore; that process will delete the lock file
        }
    }
}
catch (Exception ex)
{
    // another process already has exclusive access to the lock file, we don't need to do anything
    // or we failed while processing, in which case we did not move the file so it will be tried again by this process or another
}

One nice thing about this pattern is that it can also be used when locking is not supported by the file storage. For example, if you were trying to process files on an FTP/SFTP server, you could keep your temporary lock files on a normal drive (or SMB share), since the lock files do not have to be in the same location as the files themselves.

I can't take credit for the idea, it's been around longer than the PC, and used by plenty of apps like Microsoft Word, Excel, Access, and most older database systems. Read: well tested.

eselk
4

The file system itself is volatile in nature so it's very difficult to try and do what you want. This is a classic race condition in the file system. With option 2, you could alternatively move the file to a "processing" or staging directory that you create before doing your work. YMMV on performance but you could at least benchmark it to see if it could fit your needs.

Bryan Crosby
3

You may need to implement some form of shared / synchronised List from the spawning thread. If the parent thread keeps track of files by periodically checking the directory, it can then hand them off to child threads and that'll eliminate the locking problem.

iambeanie
  • It would work if my process were the only one with access to the files, but they're accessible to other users/processes. – GDemartini Apr 01 '13 at 17:41
0

This solution, though not 100% watertight, may well get you what you need. (It did for us.)

Use two locks that together give you exclusive access to the file. When you are ready to delete the file, you release one of them, then delete the file. The remaining lock will still prevent most other processes from obtaining a lock.

FileInfo file = ...

// Get read access to the file and only allow other processes write or delete access.
// Keeps others from locking the file for reading.
var readStream = file.Open(FileMode.Open, FileAccess.Read, FileShare.Write | FileShare.Delete);
FileStream preventWriteAndDelete;
try
{
    // Now try to get a lock that only allows others to read the file.  We can acquire both
    // locks because they each allow the other.  Together, they give us exclusive access to the
    // file.
    preventWriteAndDelete = file.Open(FileMode.Open, FileAccess.Write, FileShare.Read);
}
catch
{
    // We couldn't get the second lock, so release the first.
    readStream.Dispose();
    throw;
}

Now you can read the file (with readStream). If you need to write to it, you'll have to do that with the other stream.

When you are ready to delete the file, you first release the lock that prevents writing and deletion while still holding the lock that prevents reading.

preventWriteAndDelete.Dispose(); // Release lock that prevents deletion.
file.Delete();
// This lock specifically allowed deletion, but with the file gone, we're done with it now.
readStream.Dispose(); 

The only opportunity for another process (or thread) to get a lock on the file is if it requests write-only access while allowing others to read, which is the same combination as the second handle above. This is not very common. Most processes attempt either a shared read lock (read access allowing others to read, but not write or delete) or an exclusive write lock (write or read/write access with no sharing). Both of these common scenarios will fail. A shared read/write lock (requesting read/write access and allowing others the same) will also fail.

In addition, the window of opportunity for a process to request and acquire a shared write lock is very small. If a process is hammering away trying to acquire such a lock, then it may succeed, but few applications do this. So unless you have such an application in your scenario, this strategy should meet your needs.

You can also use the same strategy to move the file.

preventWriteAndDelete.Dispose();
file.MoveTo(destination);
readStream.Dispose();
Scott
-1

You could use the MoveFileEx API function to mark the file for deletion upon next reboot. Source

Community
Alina B.