3

It was clearly stated that File.Move is atomic operation here: Atomicity of File.Move.

But the following code snippet results in visibility of moving the same file multiple times.

Does anyone know what is wrong with this code?

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

namespace FileMoveTest
{
    class Program
    {
        static void Main(string[] args)
        {
            string path = "test/" + Guid.NewGuid().ToString();

            CreateFile(path, new string('a', 10 * 1024 * 1024));

            var tasks = new List<Task>();

            for (int i = 0; i < 10; i++)
            {
                var task = Task.Factory.StartNew(() =>
                {
                    try
                    {
                        string newPath = path + "." + Guid.NewGuid();

                        File.Move(path, newPath);

                        // this line does NOT solve the issue
                        if (File.Exists(newPath))
                            Console.WriteLine(string.Format("Moved {0} -> {1}", path, newPath));
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine(string.Format("  {0}: {1}", e.GetType(), e.Message));
                    }
                });

                tasks.Add(task);
            }

            Task.WaitAll(tasks.ToArray());
        }

        static void CreateFile(string path, string content)
        {
            string dir = Path.GetDirectoryName(path);

            if (!Directory.Exists(dir))
            {
                Directory.CreateDirectory(dir);
            }

            using (FileStream f = new FileStream(path, FileMode.OpenOrCreate))
            {
                using (StreamWriter w = new StreamWriter(f))
                {
                    w.Write(content);
                }
            }
        }
    }
}

The paradoxical output is below. Seems that file was moved multiple times onto different locations. On the disk only one of them is present. Any thoughts?

Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.0018d317-ed7c-4732-92ac-3bb974d29017
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.3965dc15-7ef9-4f36-bdb7-94a5939b17db
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.fb66306a-5a13-4f26-ade2-acff3fb896be
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.c6de8827-aa46-48c1-b036-ad4bf79eb8a9
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.

The resulting file is here:

eb85560d-8c13-41c1-926a-6871be030742.fb66306a-5a13-4f26-ade2-acff3fb896be

UPDATE. I can confirm that checking File.Exists also does NOT solve the issue - it can report that single file was really moved into several different locations.

SOLUTION. The solution I end up with is following: Prior to operations with source file create special "lock" file, if it succeeded then we can be sure that only this thread got exclusive access to the file and we are safe to do anything we want. The below is right set of parameters to create suck "lock" file.

File.Open(lockPath, FileMode.CreateNew, FileAccess.Write);
Community
  • 1
  • 1
Eugene D. Gubenkov
  • 5,127
  • 6
  • 39
  • 71
  • At the end of those operations where actually is the file? Is it in one spot or multiple? I would guess that it is in the last success move's location only. It would shed some light on what is happening. – Mike Zboray Jul 08 '15 at 21:11
  • @mikez, I've added this info to the question - it's 3rd – Eugene D. Gubenkov Jul 08 '15 at 21:15
  • 1
    Interesting, but remember atomic means the system is observable in the starting state (the file is unchanged) or the ending state (the file is in a new location) but not in between. It does not make any guarantee about *when* it is observable in a particular state. It also makes no guarantee about the order in which particular operations are processed. – Mike Zboray Jul 08 '15 at 21:17
  • I believe that the file system behaves as indicated by @Peter duniho. But, I consider that the file move should send exceptions when operation fails (for any reason, even in case of concurent move). In other words, no exception shall be a guarantee of success. If the move behavior was normal (not the case), the code is perfectly correct. A work around is to test that file exists, i.e. File.Exists(newPath). – Graffito Jul 08 '15 at 21:25
  • @Graffito, I've tried File.Exists as post Move validation step and it does not solve the issue - I can get two thread reported that it is moved and target exists, but it does not. Any thoughts? Things becomes much more interesting :) – Eugene D. Gubenkov Jul 10 '15 at 11:00
  • I suspect a SMB2 file system cache issue. Look [here](http://stackoverflow.com/questions/5159220/windows-file-share-why-sometimes-newly-created-files-arent-visible-for-some-pe). It's realy annoying: I also used the "File.Move()" for synchronization purpose in some applications. – Graffito Jul 10 '15 at 11:18

1 Answers1

4

Does anyone know what is wrong with this code?

I guess that depends on what you mean by "wrong".

The behavior you're seeing is not IMHO unexpected, at least if you're using NTFS (other file systems may or may not behave similarly).

The documentation for the underlying OS API (MoveFile() and MoveFileEx() functions) is not specific, but in general the APIs are thread-safe, in that they guarantee the file system will not be corrupted by concurrent operations (of course, your own data could be corrupted, but it will be done in a file-system-coherent way).

Most likely what is occurring is that as the move-file operation proceeds, it does so by first getting the actual file handle from the given directory link to it (in NTFS, all "file names" that you see are actually hard links to an underlying file object). Having obtained that file handle, the API then creates a new file name for the underlying file object (i.e. as a hard link), and then deletes the previous hard link.

Of course, as this progresses, there is a window during the time between a thread having obtained the underlying file handle but before the original hard link has been deleted. This allows some but not all of the other concurrent move operations to appear to succeed. I.e. eventually the original hard link doesn't exist and further attempts to move it won't succeed.

No doubt the above is an oversimplification. File system behaviors can be complex. In particular, your stated observation is that you only wind up with a single instance of the file when all is said and done. This suggests that the API does also somehow coordinate the various operations, such that only one of the newly-created hard links survives, probably by virtue of the API actually just renaming the associated hard link after retrieving the file object handle, as opposed to creating a new one and deleting the old one (implementation detail).


At the end of the day, what's "wrong" with the code is that it is intentionally attempting to perform concurrent operations on a single file. While the file system itself will ensure that it remains coherent, it's up to your own code to ensure that such operations are coordinated so that the results are predictable and reliable.

Peter Duniho
  • 68,759
  • 7
  • 102
  • 136
  • 1
    Does this all mean that successful competition of File.Move means ... nothing ? :) we should ensure then that that target file really exists via additional calls? – Eugene D. Gubenkov Jul 08 '15 at 21:13
  • I guess that depends on how you wrote the code. If your implementation actually ensured you only ever called `Move()` on a given file in one thread at a time, then the success of the call is (should be) indication of actual success. But in your scenario, yes...each individual operation does in fact succeed, but you are misusing the API and so are getting an undesirable result. If you care about the outcome, either fix your usage, or verify the result after the fact (noting, however, that will still not _guarantee_ success...thread race conditions can still mislead you). – Peter Duniho Jul 08 '15 at 21:19
  • 1
    Oh god, file systems are from the stone age. Not even a move is really atomic. – usr Jul 08 '15 at 21:20
  • _"file systems are from the stone age"_ -- you can say that again. :) That said, it seems to me that to complain that _"Not even a move is really atomic"_, one must first define "atomic". I don't know the specific implementation details, but assuming in NTFS one has to first obtain the actual file object handle, and then rename the hard link to that file object, it could be arguably true to say that the rename is "atomic" at the file system level. From the OS API point of view it's not, but at the file system level it could be. – Peter Duniho Jul 08 '15 at 21:25
  • It's just not what anybody wants. NTFS has a transaction log. They are able to make things atomic if they want to. Databases do it all the time. No excuses for the NTFS team. – usr Jul 08 '15 at 21:47
  • @PeterDuniho, Can you advise an easiest reliable way to ensure that File.Move really succeeded without introducing complex central manager to be a point of synchronization? – Eugene D. Gubenkov Jul 09 '15 at 11:49
  • What does it mean for it to "really succeed" when you are allowing more than one attempt to occur concurrently? Which attempt do you want to actually have effect? What is your _specification_ to unambiguously describe how to select the attempt that should succeed? – Peter Duniho Jul 09 '15 at 15:43
  • @Peterduniho, let me try to define it, I thought it was clear, sorry. I don't want certain concurrent attempt to succeed, I just want to know _which one_ actually done the deal and _which ones_ silently failed. – Eugene D. Gubenkov Jul 09 '15 at 18:47
  • Until you rid yourself of the race condition, I don't see a way to do that. You can call `File.Exists()` after performing the `File.Move()`, and I would expect that would _usually_ give you some indication of success (i.e. if the intended new name is present, it _probably_ succeeded), but I see no reason that depending on thread scheduling, you wouldn't still sometimes see `File.Exists()` return `true`, only to find some other thread still manage to change the name just moments later. – Peter Duniho Jul 09 '15 at 19:07
  • It seems to me that fundamentally, you have a flawed design...multiple threads all trying to rename the same file at the same time, I just cannot see any reason that's a desirable design, never mind correct, in the first place. Even ignoring the immediate concern, it just doesn't seem correct, and the immediate concern seems like reasonably strong proof of the flawed nature of the design. – Peter Duniho Jul 09 '15 at 19:09
  • @PeterDuniho, actually it's not several threads, but several servers running it code that point to same shared folder. As to File.Exsits, it doesn't really change anything (updated post). By the way i've trying creating special ".lock" file while operations with file and it seems to work! Tested with multi-thread environment, not multi-server, but suspect it will be ok as well – Eugene D. Gubenkov Jul 10 '15 at 16:11
  • 1
    Yes, for cross-process synchronization, a "lock" file can work (make sure you use it properly, with the "CreateNew" option, to ensure only one process can ever create it). If the processes are on the same machine, you can use a named mutex for the same result. Both of these are what I'd call solutions that involve a _"complex central manager to be a point of synchronization"_, which you said you were trying to avoid. But yes, synchronizing the processes is the only reliable way to fix the issue. – Peter Duniho Jul 10 '15 at 16:25