90

How can I copy a file in C# without blocking a thread?

Martijn
user95883
  • 1
    I'm a bit puzzled by the closure; seems like the question is very straightforward. – Casey Jul 27 '20 at 03:34
  • @tgdavies this question was written in 2009 (1 year after SO was created). The rules were different back then. You can't expect all the OPs from that era to come back and update their questions to follow the modern rules. Go ahead and update the question yourself if you like. – rory.ap Sep 27 '22 at 12:37
  • @rory.ap doh, didn't look at the date! – tgdavies Sep 27 '22 at 21:58

10 Answers

58

The idea of async programming is to allow the calling thread (assuming it's a thread pool thread) to return to the thread pool for use on some other task while async IO completes. Under the hood, the call context gets stuffed into a data structure and one or more IO completion threads monitor the call, waiting for completion. When the IO completes, the completion thread invokes back onto a thread pool thread, restoring the call context. That way, instead of 100 threads blocking, there are only the completion threads and a few thread pool threads sitting around mostly idle.

The best I can come up with is:

public async Task CopyFileAsync(string sourcePath, string destinationPath)
{
  // File.Open has no single-argument overload; File.OpenRead opens the file for reading.
  using (Stream source = File.OpenRead(sourcePath))
  {
    using (Stream destination = File.Create(destinationPath))
    {
      await source.CopyToAsync(destination);
    }
  }
}

I haven't done extensive perf testing on this, though. I'm a little worried, because if it were that simple it would already be in the core libraries.

await does what I am describing behind the scenes. If you want to get a general idea of how it works it would probably help to understand Jeff Richter's AsyncEnumerator. They might not be completely the same line for line but the ideas are really close. If you ever look at a call stack from an "async" method you'll see MoveNext on it.

As for moving: it doesn't need to be async if it's really a "move" and not a copy followed by a delete. A move is a fast, atomic operation against the file table. It only works that way, though, if you don't move the file to a different partition.
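The distinction above can be sketched roughly as follows. `FileMover` and `MoveFileAsync` are hypothetical names, and comparing path roots is only a heuristic for "same partition" (it does not account for mount points or junctions), so treat this as an illustration under those assumptions rather than a robust implementation:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileMover
{
    // Hypothetical helper: not part of the answer or of the base class library.
    public static async Task MoveFileAsync(string sourcePath, string destinationPath)
    {
        string sourceRoot = Path.GetPathRoot(Path.GetFullPath(sourcePath));
        string destinationRoot = Path.GetPathRoot(Path.GetFullPath(destinationPath));

        if (string.Equals(sourceRoot, destinationRoot, StringComparison.OrdinalIgnoreCase))
        {
            // Same root: assume a rename within one volume, which should be cheap.
            File.Move(sourcePath, destinationPath);
            return;
        }

        // Different roots: copy the bytes asynchronously, then delete the source.
        // FileMode.CreateNew mirrors File.Move, which fails if the destination exists.
        using (Stream source = File.OpenRead(sourcePath))
        using (Stream destination = new FileStream(destinationPath, FileMode.CreateNew))
        {
            await source.CopyToAsync(destination);
        }
        File.Delete(sourcePath);
    }
}
```

A caller would simply `await FileMover.MoveFileAsync(from, to);`. If the heuristic guesses wrong, `File.Move` still works across volumes; it just blocks while the data is copied.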

csaam
  • Please can you tell me, what does it mean ( await source.CopyToAsync(destination); ) ? – Khaleel Hmoz Dec 05 '13 at 14:21
  • 2
    Internally, what await does in a method marked as async is wait for the awaited piece of code to complete. Naively we can say it blocks. However, it doesn't really block. Real blocking behavior like a Wait() keeps the active thread stuck at the point of execution. Await actually causes the context of whatever the thread is doing to be stuck into a data structure and allows the active thread to return to the thread pool, where it can be used for something else. When await does return, a thread pool thread (probably not the same one) retrieves the context and resumes execution. – csaam Jan 15 '14 at 08:24
  • That doesn't sound like that big of a deal, but using async correctly can mean reducing the number of actual active threads running. In IO-intensive services this can be a big deal. I've written code that can have 80 active concurrent requests out but only 5 or so active threads. The net result is a higher CPU usage but also higher throughput for my service per instance. So you're getting more bang for your buck out of your hardware. – csaam Jan 15 '14 at 08:29
  • 18
    Making these methods `async` must be as difficult as building the f$ing Death Star... this answer's got 2 years now... and nothing changed! No `File.CopyAsync`, no `File.GetInfoAsync`, no `Directory.EnumerateAsync`. – Miguel Angelo Feb 13 '15 at 18:12
  • 3
    If anyone is worried about this: Microsoft has an example with the same code, so I guess it must be legit: https://msdn.microsoft.com/en-us/library/hh159084(v=vs.110).aspx – Adam Tal Apr 06 '15 at 18:18
  • 7
    Note that if you don't explicitly open the files with a specific hint that you're going to use it asynchronously (and this doesn't do that), then what happens behind-the-scenes boils down to synchronous writes on the thread pool. See the answer from DrewNoakes for something that does provide such a hint. – Joe Amenta Sep 20 '16 at 12:28
  • I used that code to copy 33 files about 100MB each from a network location to my local hard drive and 2 of them were corrupted. They had the proper file size but at some point they were just filled with zeros so it looks like the copy was aborted. I'm not sure how to check for errors. Do we need to check Task.Status or Task.Exception or Task.IsFaulted or Task.IsCanceled or Task.IsCompleted or all of them? – Slion Aug 28 '18 at 13:13
  • I ended up using RoboSharp: https://github.com/tjscience/RoboSharp Possibly the best c# solution for async file copy on Windows. – Slion Aug 29 '18 at 07:50
  • If you also want to keep the date of the original file (as a normal File.Copy does), you should add: File.SetLastWriteTime(destinationFile, File.GetLastWriteTime(sourceFile)); – Jos Mar 03 '19 at 21:15
  • Note that in case of a file system that support automatic deduplication (APFS, ReFS, both Microsoft and Apple) - this method will not take advantage of that, as it will create a new instance instead of linking under the hood. We need a .NET Core function to implement proper Async file copy. – daniel.gindi Nov 29 '20 at 07:29
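Two of the comments above ask how to detect failures and how to keep the original timestamp. A minimal sketch, assuming the simple stream-based copy from the answer: awaiting the task rethrows any exception the copy raised, so there is usually no need to inspect `Task.Status` or `Task.IsFaulted` by hand. `CheckedCopy` and `TryCopyFileAsync` are hypothetical names, and this does not verify file contents (for example with a hash), which the corruption report above might call for:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class CheckedCopy
{
    // Hypothetical helper: returns false instead of throwing on I/O failures.
    public static async Task<bool> TryCopyFileAsync(string sourcePath, string destinationPath)
    {
        try
        {
            using (Stream source = File.OpenRead(sourcePath))
            using (Stream destination = File.Create(destinationPath))
            {
                // If the copy fails, the exception surfaces here, at the await.
                await source.CopyToAsync(destination);
            }

            // Keep the original modification time, as suggested in the comments.
            File.SetLastWriteTime(destinationPath, File.GetLastWriteTime(sourcePath));
            return true;
        }
        catch (IOException ex)
        {
            // Only I/O errors are handled; anything else propagates to the caller.
            Console.Error.WriteLine("Copy failed: " + ex.Message);
            return false;
        }
    }
}
```

Callers that prefer exceptions can drop the `try`/`catch` and let the `await` throw.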
41

Here's an async file copy method that gives the OS hints that we're reading and writing sequentially, so that it can pre-fetch data on the read and have things ready for the write:

public static async Task CopyFileAsync(string sourceFile, string destinationFile)
{
    using (var sourceStream = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
    using (var destinationStream = new FileStream(destinationFile, FileMode.CreateNew, FileAccess.Write, FileShare.None, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
        await sourceStream.CopyToAsync(destinationStream);
}

You can experiment with the buffer size as well. Here it's 4096 bytes.
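To make that experiment easier, the buffer size can be exposed as a parameter. This is a sketch of the same method with one assumption added: the `bufferSize` parameter and its default of 81920 bytes (a figure quoted elsewhere in this thread) are illustrative, not a recommendation, and `BufferedCopy` is a hypothetical class name:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class BufferedCopy
{
    // bufferSize is an assumption added for experimentation; measure before settling on a value.
    public static async Task CopyFileAsync(string sourceFile, string destinationFile, int bufferSize = 81920)
    {
        const FileOptions options = FileOptions.Asynchronous | FileOptions.SequentialScan;

        using (var sourceStream = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize, options))
        using (var destinationStream = new FileStream(destinationFile, FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize, options))
        {
            // Use the same size for the copy loop as for the stream buffers.
            await sourceStream.CopyToAsync(destinationStream, bufferSize);
        }
    }
}
```

Timing a few sizes against your own storage (local disk, network share) is probably the only reliable way to pick one.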

Drew Noakes
  • What exactly happens after the first line of code? does it free the thread until the pre-fetching of data from the file is done? – BornToCode Jan 22 '17 at 09:00
  • Runtime imposes no guarantees. Here's what we all hope will happen: if the request can be serviced without waiting on external resources, then await will complete synchronously. Otherwise state will be captured, threading context and all, thread will yield, and continuation will run once request completes. In my enhanced code below, threading context is not captured. This means that possibly a different thread from I/O completion pool will run. – GregC Apr 06 '17 at 15:07
  • [You might want to use a much larger buffer than just 4096 bytes](https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/). – Dai Mar 25 '23 at 03:14
16

I've enhanced code by @DrewNoakes slightly (performance and cancellation):

public static async Task CopyFileAsync(string sourceFile, string destinationFile, CancellationToken cancellationToken)
{
    var fileOptions = FileOptions.Asynchronous | FileOptions.SequentialScan;
    var bufferSize = 4096;

    using (var sourceStream =
        new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize, fileOptions))
    using (var destinationStream =
        new FileStream(destinationFile, FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize, fileOptions))
    {
        await sourceStream.CopyToAsync(destinationStream, bufferSize, cancellationToken)
                          .ConfigureAwait(continueOnCapturedContext: false);
    }
}
GregC
  • 4
    This might be misleading: if we're working on a GUI app, we want to return to the captured context. This should be the user's decision, a level higher (`await CopyFileAsync().ConfigureAwait(false)`). – Jakoss Apr 13 '17 at 10:05
  • 1
    I agree. Base Class Library team recommends postponing the context capturing behavior configuration to caller. The code doesn't capture context. – GregC Apr 13 '17 at 16:09
  • 4
    Setting the buffer size to 4096 in `CopyToAsync` greatly reduces the speed when writing to a network share. Using the default of 81920 is a better option, in my case the speed went from 2 Mbps to 25 Mbps. See [this related question](https://stackoverflow.com/questions/14587494/writing-to-file-using-streamwriter-much-slower-than-file-copy-over-slow-network) for an explanation. – user247702 Jan 11 '19 at 15:01
  • 4
    @Nekromancer Actually, using `await sourceStream.CopyToAsync().ConfigureAwait(false)` here is correct, because the remaining method code (nothing) doesn't care about which context it runs in. Your calling method uses its own `await CopyFileAsync()` with its own `ConfigureAwait()`, which will be set to `true` if not set explicitly. – lauxjpn Sep 03 '19 at 11:49
  • 1
    @user247702 This should be only 64K = 65536 bytes according to your linked question: _"Increasing the buffer size beyond 64k will not help in any circumstance, as the underlying SMB protocol does not support buffer lengths beyond 64k."_ – lauxjpn Sep 03 '19 at 11:57
  • 1
    @user247702 actually the `DefaultBufferSize` for [System.IO.FileSystem.Unix.cs](https://source.dot.net/#System.IO.FileSystem/System/IO/FileSystem.Unix.cs) is **4096**! – Søren May 19 '21 at 00:09
  • @lauxjpn but that's an implementation detail of the networking protocol. Something a user of this code may not run into if all they're doing is copying a file locally. – Tom Lint Apr 01 '22 at 08:27
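The context discussion in the comments above can be illustrated with a small sketch. The names `ContextExample` and `CopyAndGetLengthAsync` are hypothetical, and the copy itself is simplified compared with the answer's code. The inner method opts out of context capture for its own continuation, while the caller's plain `await` still captures the caller's context:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ContextExample
{
    // Library-style method: nothing after the await needs the caller's context.
    public static async Task CopyFileAsync(string sourceFile, string destinationFile)
    {
        using (Stream source = File.OpenRead(sourceFile))
        using (Stream destination = File.Create(destinationFile))
        {
            await source.CopyToAsync(destination).ConfigureAwait(false);
        }
    }

    // Hypothetical caller: a plain await captures the current context by default,
    // so in a GUI app the code after it resumes on the UI thread.
    public static async Task<long> CopyAndGetLengthAsync(string sourceFile, string destinationFile)
    {
        await CopyFileAsync(sourceFile, destinationFile);
        return new FileInfo(destinationFile).Length;
    }
}
```

The `ConfigureAwait(false)` inside `CopyFileAsync` only affects where that method's own continuation runs; it does not change what the caller's `await` does.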
13

While there are some circumstances where you'd want to avoid Task.Run, Task.Run(() => File.Move(source, dest)) will work. It is worth considering because when a file is simply moved within the same disk/volume, it is an almost instantaneous operation: the headers are changed but the file contents are not moved. The various "pure" async methods invariably copy the stream, even when there is no need to do this, and as a result can be quite a bit slower in practice.

Casey
  • The problem is that when moving files on the same volume and it simply changes the headers, this uses an unnecessary thread. – IS4 Sep 21 '17 at 13:27
  • 2
    @IllidanS4 That's unfortunate but we're potentially talking about saving several minutes, if your files are big enough. – Casey Oct 01 '18 at 01:22
  • 1
    -1. Moving and copying are two different operations, and the OP specifically asked about the latter, which doesn't involve the disappearance of the source file. – Tom Lint Apr 01 '22 at 08:14
  • 1
    @TomLint If you look at the edit history you will see that the question originally read "how to copy/move a file." I can't help that someone changed the question four years after I answered it. – Casey Apr 04 '22 at 19:04
  • @Casey revisionist history generally wins, you better figure out a strategy to keep contextual tabs on everything you do in your world man... (Sorry.. I found your response too funny to pass up) Actually your answer here served my purposes well, so thank you for that – PhideasAmbrosianus Feb 24 '23 at 18:56
8

You can use asynchronous delegates

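using System;
using System.IO;

public class AsyncFileCopier
{
    public delegate void FileCopyDelegate(string sourceFile, string destFile);

    public static void AsynFileCopy(string sourceFile, string destFile)
    {
        FileCopyDelegate del = new FileCopyDelegate(FileCopy);
        // Pass the delegate as state so the callback can call EndInvoke on it.
        // Note: Delegate.BeginInvoke is only supported on .NET Framework.
        del.BeginInvoke(sourceFile, destFile, CallBackAfterFileCopied, del);
    }

    public static void FileCopy(string sourceFile, string destFile)
    {
        // Runs on a thread pool thread.
        File.Copy(sourceFile, destFile);
    }

    public static void CallBackAfterFileCopied(IAsyncResult result)
    {
        // EndInvoke rethrows any exception raised by FileCopy.
        ((FileCopyDelegate)result.AsyncState).EndInvoke(result);
        // Code to be run after file copy is done
    }
}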

You can call it as:

AsyncFileCopier.AsynFileCopy("abc.txt", "xyz.txt");

This link tells you the different techniques of async coding

Rashmi Pandit
  • 6
    I think the question was to do the operation asynchronously, without consuming a thread. There are multiple ways to delegate work to the threadpool, most of which are easier than the mechanism here. – John Melville Jan 28 '12 at 05:21
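As the comment notes, there are simpler ways to hand the copy to the thread pool. One possible Task-based sketch is below; `TaskFileCopier` and `CopyInBackgroundAsync` are hypothetical names. It is also worth knowing that `Delegate.BeginInvoke` throws `PlatformNotSupportedException` on .NET Core and later, so something along these lines is needed there anyway. Like the delegate version, this still occupies a thread pool thread for the duration of the copy:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TaskFileCopier
{
    // Hypothetical helper: runs File.Copy on the thread pool, then invokes the
    // callback with null on success or with the exception that caused the failure.
    public static Task CopyInBackgroundAsync(string sourceFile, string destFile, Action<Exception> onCompleted)
    {
        return Task.Run(() => File.Copy(sourceFile, destFile))
                   .ContinueWith(t => onCompleted(t.Exception?.GetBaseException()));
    }
}
```

It can be called as `TaskFileCopier.CopyInBackgroundAsync("abc.txt", "xyz.txt", error => { /* handle completion */ });`, and the returned task can be awaited if the caller wants to know when the callback has run.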
5

You can do it as this article suggested:

public static void CopyStreamToStream(
    Stream source, Stream destination,
    Action<Stream, Stream, Exception> completed)
    {
        byte[] buffer = new byte[0x1000];
        AsyncOperation asyncOp = AsyncOperationManager.CreateOperation(null);

        Action<Exception> done = e =>
        {
            if(completed != null) asyncOp.Post(delegate
                {
                    completed(source, destination, e);
                }, null);
        };

        AsyncCallback rc = null;
        rc = readResult =>
        {
            try
            {
                int read = source.EndRead(readResult);
                if(read > 0)
                {
                    destination.BeginWrite(buffer, 0, read, writeResult =>
                    {
                        try
                        {
                            destination.EndWrite(writeResult);
                            source.BeginRead(
                                buffer, 0, buffer.Length, rc, null);
                        }
                        catch(Exception exc) { done(exc); }
                    }, null);
                }
                else done(null);
            }
            catch(Exception exc) { done(exc); }
        };

        source.BeginRead(buffer, 0, buffer.Length, rc, null);
    }
Pablo Retyk
  • 2
    Streams now have a built-in copy operation that makes this much easier. But my problem with this technique is that it always copies the file, even when it's on the same disk and no such operation is necessary. – Casey Aug 04 '16 at 01:52
2

I implemented this solution for copying large files (backup files) and it's terribly slow! For smaller files it's not a problem, but for large files just use File.Copy or robocopy with the /mt (multithreaded) parameter.

Note that an async file copy API is still an open issue for .NET: https://github.com/dotnet/runtime/issues/20695

Gregory Liénard
2

AFAIK, there is no high-level async API to copy a file. However, you can build your own API to accomplish that task using the Stream.BeginRead/EndRead and Stream.BeginWrite/EndWrite APIs. Alternatively, you can use the BeginInvoke/EndInvoke methods as mentioned in the other answers here, but keep in mind that they won't give you non-blocking async I/O. They merely perform the task on a separate thread.
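A hand-rolled copy loop of the kind described above might look like the following sketch. It uses `ReadAsync`/`WriteAsync`, the Task-based successors of `BeginRead`/`EndRead` and `BeginWrite`/`EndWrite`, rather than the Begin/End pairs themselves. The `ManualCopy` class, the default buffer size, and the progress parameter are assumptions added for illustration:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ManualCopy
{
    // Hypothetical helper: copies source to destination in chunks and
    // returns the total number of bytes copied.
    public static async Task<long> CopyStreamAsync(
        Stream source, Stream destination, int bufferSize = 81920, IProgress<long> progress = null)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // ReadAsync returns 0 at end of stream.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read);
            total += read;
            progress?.Report(total); // optional running byte count
        }

        return total;
    }
}
```

Whether the I/O is truly non-blocking still depends on how the streams were opened, for example with `FileOptions.Asynchronous` for a `FileStream`.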

Charles Prakash Dasari
-4

I would suggest that the File Copy IO function, available in the .NET programming languages, is asynchronous in any case. After using it within my program to move small files, it appears that subsequent instructions begin to execute before the actual file copy is finished. I'm guessing that the executable gives Windows the task of doing the copy and then immediately returns to execute the next instruction, not waiting for Windows to finish. This forces me to construct while loops just after the call to copy that will execute until I can confirm the copy is complete.

Tom
  • 5
    The reason that works is that if you're moving a file within the same drive nothing needs to be rewritten except the headers. If you move to a different drive you can convince yourself that it is not an asynchronous operation. – Casey Mar 31 '15 at 11:30
  • And to extend Casey's response, copying files over a VPN or WAN in general is still a LOT slower. – Chris Walsh Apr 23 '15 at 22:24
-6

The correct way to copy: use a separate thread.

Here's how you might be doing it (synchronously):

//.. [code]
doFileCopy();
// .. [more code]

Here's how to do it asynchronously:

// .. [code]
new System.Threading.Thread(doFileCopy).Start();
// .. [more code]

This is a very naive way to do things. Done well, the solution would include some event/delegate method to report the status of the file copy, and notify important events like failure, completion etc.
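A slightly less naive version, with the completion/failure notification mentioned above, could be sketched like this. `ThreadedCopy`, `CopyOnThread`, and the single `onFinished` callback are hypothetical choices; progress reporting is left out:

```csharp
using System;
using System.IO;
using System.Threading;

public static class ThreadedCopy
{
    // Hypothetical helper: copies on a dedicated thread and reports the outcome.
    // onFinished receives null on success, or the exception on failure,
    // and is invoked on the worker thread, not on the caller's thread.
    public static Thread CopyOnThread(string sourceFile, string destFile, Action<Exception> onFinished)
    {
        var thread = new Thread(() =>
        {
            Exception error = null;
            try
            {
                File.Copy(sourceFile, destFile);
            }
            catch (Exception ex)
            {
                error = ex;
            }
            onFinished(error);
        });

        thread.IsBackground = true; // don't keep the process alive just for the copy
        thread.Start();
        return thread;
    }
}
```

The caller can `Join()` the returned thread if it needs to wait. Note that this still dedicates a whole thread to the copy, which is what the question set out to avoid.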

cheers, jrh

jrharshath