We currently have one application that monitors a folder for new files. To make it fault tolerant and able to process more files at once, we want to run multiple instances of this application on different machines. We use File.Move to "lock" a file and make sure that only one thread can process a file at a time.

To test that only one application and/or thread can perform a File.Move on a file, I created a simple application (based on the original application's code) that creates 10 threads per instance and monitors a folder. When a thread detects a new file, it performs File.Move on it and changes the file's extension, to try to stop other threads from doing the same.

I have seen an issue when running multiple copies of this application (and when running a single copy on its own), whereby 2 threads (either in the same application or in different ones) both successfully perform File.Move with no exception thrown, but only the thread that performed it last actually renamed the file (I can tell because I change the file's extension to include DateTime.Now.ToFileTime()). I have looked at what File.Move does: it checks that the file exists before performing the operation, then calls out to Win32Native.MoveFile to perform the move.

All the other threads/applications throw an exception, as I would expect.

The reasons why this is an issue are:

  1. I thought only 1 thread can perform a File.Move on a file at a time.
  2. I need to reliably have only one application/thread be able to process a file at a time.

Here is the code that performs the File.Move:

public bool TryLock(string originalFile, out string changedFileName)
{
    FileInfo fileInfo = new FileInfo(originalFile);
    changedFileName = Path.ChangeExtension(originalFile, ".original." + DateTime.Now.ToFileTime());
    try
    {
        File.Move(originalFile, changedFileName);
    }
    catch (IOException ex)
    {
        Console.WriteLine("{3} - Thread {1}-{2} File {0} is already in use", fileInfo.Name, Thread.CurrentThread.ManagedThreadId, id, DateTime.Now.ToLongTimeString());
        return false;
    }
    catch (Exception ex)
    {
        Console.WriteLine("{3} - Thread {1}-{2} File {0} error {4}", fileInfo.Name, Thread.CurrentThread.ManagedThreadId, id, DateTime.Now.ToLongTimeString(), ex);
        return false;
    }
    return true;
}

Note - id is just a sequential number I assigned to each thread for logging.

I am running Windows 7 Enterprise SP1 on an SSD with NTFS.

Cœur
himsy
  • Why aren't you using a `lock`? – Yuval Itzchakov Jul 28 '14 at 09:20
  • move = copy to new destination, then delete original. – Ankit Jul 28 '14 at 09:26
  • @YuvalItzchakov This should only allow one thread to open the file, but as stated in the answer below, we could get into a race condition when releasing the lock and deleting the file. – himsy Jul 28 '14 at 09:44
  • Do you have a distributed architecture? – BRAHIM Kamel Jul 28 '14 at 09:47
  • @K.B Yes, we have 4 internal production servers, but at the moment only one is active (due to the lack of reliable file "locking"). The rest of our architecture is made up of NServiceBus, MSMQ, Azure Worker Roles and Azure Service Bus. – himsy Jul 28 '14 at 09:53
  • Are you moving across volumes or within a volume? In the latter case no copying is necessary. – usr Jul 28 '14 at 10:03
  • @usr In production, once a thread has "locked" a file, we will be moving it across network shares – himsy Jul 28 '14 at 10:06
  • Getting an atomic move requires using MoveFileTransacted(). Microsoft however does not plan to maintain this function so it is important that you solve this in your own code. Threads or processes are going to have to coordinate with each other so they don't try to move the same file at the same time. – Hans Passant Jul 28 '14 at 12:38

4 Answers

From the MSDN description, I assume that File.Move does not open the file in exclusive mode. The description says:

"If you try to move a file across disk volumes and that file is in use, the file is copied to the destination, but it is not deleted from the source."

Anyway, I think you are better off creating your own move mechanism that opens the file in exclusive mode before copying it (and then deleting it):

File.Open(pathToYourFile, FileMode.Open, FileAccess.Read, FileShare.None);

Other threads won't be able to open it while the move operation is in progress. You might still have a race condition between the moment the copy finishes (when you need to dispose of the file handle) and the moment you delete the file.
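
A minimal sketch of that copy-then-delete approach, assuming a hypothetical helper named TryMoveExclusive (not part of the framework). It only illustrates the idea; the window between closing the handle and deleting the file is still there, and a failed copy can leave a partial destination file behind.

```csharp
using System;
using System.IO;

public static class ExclusiveMover
{
    // Hypothetical helper: returns false if the file is missing, is held by
    // another thread or process, or the destination already exists.
    public static bool TryMoveExclusive(string source, string destination)
    {
        try
        {
            // FileShare.None makes any concurrent open of the source fail
            // with an IOException while this handle is alive.
            using (var input = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.None))
            // FileMode.CreateNew throws if the destination already exists.
            using (var output = new FileStream(destination, FileMode.CreateNew, FileAccess.Write, FileShare.None))
            {
                input.CopyTo(output);
            }
        }
        catch (IOException)
        {
            return false;
        }

        // Race window: the source handle is already closed here, so another
        // worker could open the file before this delete runs.
        File.Delete(source);
        return true;
    }
}
```

A caller would treat a false result as "someone else has this file" and move on to the next one.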

Marcel N.
  • Thanks @marceln. We have been trying with this method and came to the same conclusion regarding the race condition. Which is what we are trying to avoid in the first place. – himsy Jul 28 '14 at 09:38
  • You can't move the file after you have opened it. – usr Jul 28 '14 at 10:02
  • @usr: I know, that's why I said copy and then delete it. – Marcel N. Jul 28 '14 at 11:25

Using File.Move as a lock isn't going to work. As stated in @marceln's answer, it won't delete the source file if it is already in use elsewhere, and it doesn't have "locking" behavior, so you can't rely on it.

What I would suggest is to use a BlockingCollection<T> to manage the processing of your files:

// Assuming this BlockingCollection is already filled with all string file paths
private BlockingCollection<string> _blockingFileCollection = new BlockingCollection<string>();

public bool TryProcessFile(string originalFile, out string changedFileName)
{
    FileInfo fileInfo = new FileInfo(originalFile);
    changedFileName = Path.ChangeExtension(originalFile, ".original." + DateTime.Now.ToFileTime());

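    // TryTake returns false when the collection is empty,
    // meaning there is nothing left for this thread to claim.
    string itemToProcess;
    if (!_blockingFileCollection.TryTake(out itemToProcess))
    {
        return false;
    }

    // The file should be moved exclusively by one thread;
    // all others should return false.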

    File.Move(originalFile, changedFileName);
    return true;
}
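
As a rough illustration of how the collection hands each path to exactly one worker, here is a small producer/consumer sketch. The class and method names (FileQueueDemo, ProcessAll) are hypothetical, and the "processing" is only a placeholder. Note that a BlockingCollection<T> coordinates threads inside a single process only; it does not by itself coordinate separate application instances or machines.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class FileQueueDemo
{
    // Hypothetical helper: distributes the given paths across workerCount
    // consumer tasks and returns the paths that were processed.
    public static List<string> ProcessAll(IEnumerable<string> paths, int workerCount)
    {
        var processed = new ConcurrentBag<string>();

        using (var queue = new BlockingCollection<string>())
        {
            var workers = new Task[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = Task.Run(() =>
                {
                    // Each item is removed from the collection by exactly one consumer.
                    foreach (string path in queue.GetConsumingEnumerable())
                    {
                        processed.Add(path); // placeholder for the real file processing
                    }
                });
            }

            foreach (string path in paths)
            {
                queue.Add(path);
            }

            // Signals the consumers that no more items will arrive.
            queue.CompleteAdding();
            Task.WaitAll(workers);
        }

        return new List<string>(processed);
    }
}
```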
Yuval Itzchakov

Are you moving across volumes or within a volume? In the latter case no copying is necessary.

@usr In production, once a thread has "locked" a file, we will be moving it across network shares

I'm not sure whether that is a true move or a copy operation. In any case, you could:

  1. open the file exclusively
  2. copy the data
  3. delete the source by handle (see "Deleting or Renaming a file using an open handle")

That allows you to lock other processes out of that file for the duration of the move. It is more of a workaround than a real solution. Hope it helps.

Note that for the duration of the move the file is unavailable, and other processes will receive an error when accessing it. You might need a retry loop with a time delay between operations.
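
A retry loop of that kind could look roughly like this. The helper name, the attempt count and the delay are all assumptions chosen for illustration; tune them to how long a move typically takes in your environment.

```csharp
using System;
using System.IO;
using System.Threading;

public static class Retry
{
    // Hypothetical helper: runs the operation, retrying on IOException
    // (which includes sharing violations) up to maxAttempts times.
    public static T OnIOException<T>(Func<T> operation, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (IOException)
            {
                // Give up and rethrow once the attempts are exhausted.
                if (attempt >= maxAttempts)
                {
                    throw;
                }
                Thread.Sleep(delay);
            }
        }
    }
}
```

For example, a reader could call Retry.OnIOException(() => File.ReadAllText(path), 5, TimeSpan.FromMilliseconds(200)), where path is whatever file it is waiting on.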

Here's an alternative:

  1. Copy the file to the target folder with a different extension that is being ignored by readers
  2. Atomically rename the file to remove the extension

Renaming on the same volume is always atomic. Readers might receive a sharing violation error for a very short period of time. Again, you either need a retry loop or must tolerate a very small window of unavailability.
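
The two steps of this alternative can be sketched as follows. The helper name and the ".partial" extension are assumptions for illustration; use any extension your readers are guaranteed to skip.

```csharp
using System.IO;

public static class StagedCopy
{
    // Hypothetical helper: copies the source into targetFolder under a
    // staging name, then renames it to its final name.
    public static string Publish(string source, string targetFolder)
    {
        string finalPath = Path.Combine(targetFolder, Path.GetFileName(source));
        string stagingPath = finalPath + ".partial"; // assumed to be ignored by readers

        // Step 1: copy under a name that readers skip. This can take a while
        // for large files or slow network shares.
        File.Copy(source, stagingPath, overwrite: false);

        // Step 2: rename within the same folder, so the move stays on one volume.
        File.Move(stagingPath, finalPath);
        return finalPath;
    }
}
```

Because the rename happens inside the target folder, readers only ever see either no file or the complete file under its final name.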

usr

Based on the answers and comments from @marceln and @YuvalItzchakov, I tried the following, which seems to give more reliable results:

using (var readFileStream = File.Open(originalFile, FileMode.Open, FileAccess.Read, FileShare.Delete))
{
    // The second argument of Lock/Unlock is the number of bytes, not the end position.
    readFileStream.Lock(0, readFileStream.Length);
    File.Move(originalFile, changedFileName);
    readFileStream.Unlock(0, readFileStream.Length);
}

I want to use Windows's own file copying as it should be more efficient than copying the stream and in production we will be moving the files from one network share to another.

himsy