
To copy a file asynchronously, will something like this work?

let filecopyasync (source, target) =
    let task = Task.Run((fun () ->File.Copy(source, target, true)))

    // do other stuff

    Async.AwaitIAsyncResult task

In particular, will this fire up a new thread to do the copy while I "do other stuff"?

UPDATE:

Found another solution:

let asyncFileCopy (source, target, overwrite) =
    printfn "Copying %s to %s" source target
    let fn = new Func<string * string * bool, unit>(File.Copy)
    Async.FromBeginEnd((source, target, overwrite), fn.BeginInvoke, fn.EndInvoke)

let copyfile1 = asyncFileCopy("file1", "file2", true)
let copyfile2 = asyncFileCopy("file3", "file4", true)

[copyfile1; copyfile2] |> seq |>  Async.Parallel |> Async.RunSynchronously |> ignore
user1443098

3 Answers


Your question is conflating two issues, namely multithreading and asynchrony. It's important to realise that these are entirely different concepts:

Asynchrony is about a workflow of tasks where we respond to the completion of those tasks independently of the main program flow.

Multithreading is an execution model, one which can be used to implement asynchrony, although asynchrony can be achieved in other ways (such as hardware interrupts).


Now, when it comes to I/O, the question you should not be asking is "Can I spin up another thread to do it for me?"

Why, you ask?

If you do some I/O in the main thread, you typically block the main thread waiting for results. If you evade this problem by creating a new thread, you haven't actually solved the issue, you've just moved it around. Now you've blocked either a new thread that you've created or a thread pool thread. Oh dear, same problem.

Threads are an expensive and valuable resource and shouldn't be squandered on waiting for blocking I/O to complete.

So, what is the real solution?

Well, we achieve asynchrony without dedicating a thread to the wait. We ask the OS to perform the I/O and to let us know when the operation is complete, so no thread is blocked while we're waiting for results. On Windows, this is implemented via I/O completion ports.


How do I do this in F#?

The .NET Stream.CopyToAsync method is probably the easiest approach. Since it returns a plain (non-generic) Task rather than a Task<'T>, it's helpful to create a helper method:

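open System.Threading.Tasks

type Async with
    // Adapts a non-generic Task to Async<unit>; a fault in the task is not rethrown.
    static member AwaitPlainTask (task : Task) =
        task.ContinueWith(ignore) |> Async.AwaitTask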

Then

open System.IO

[<Literal>]
let DEFAULT_BUFFER_SIZE = 4096

let copyToAsync source dest =
    async {
        // The final argument (useAsync = true) opens each stream for asynchronous I/O.
        use sourceFile = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.Read, DEFAULT_BUFFER_SIZE, true)
        // FileMode.Create truncates an existing destination so no stale bytes remain.
        use destFile = new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None, DEFAULT_BUFFER_SIZE, true)
        do! sourceFile.CopyToAsync(destFile) |> Async.AwaitPlainTask
    }

You could then use this with Async.Parallel to perform multiple copies concurrently.
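As a rough sketch of that combination: the code below repeats a variant of the copy function so that it stands alone, then runs several copies through Async.Parallel. The helper name copyAllAsync, the temp-directory layout and the file names are made up for illustration, and the sketch assumes FSharp.Core 4.0 or later, where Async.AwaitTask accepts a non-generic Task directly.

```fsharp
open System.IO

// A variant of the copy function above, repeated so this sketch stands alone.
// Assumes FSharp.Core 4.0+, where Async.AwaitTask accepts a non-generic Task.
let copyToAsync (source : string) (dest : string) =
    async {
        use sourceFile = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, true)
        use destFile = new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None, 4096, true)
        do! sourceFile.CopyToAsync(destFile) |> Async.AwaitTask
    }

// Hypothetical helper: start every copy and complete once all have finished.
let copyAllAsync (pairs : (string * string) list) =
    pairs
    |> List.map (fun (src, dst) -> copyToAsync src dst)
    |> Async.Parallel
    |> Async.Ignore

// Usage with throwaway files in the temp directory (illustrative names).
let demoDir = Directory.CreateDirectory(Path.Combine(Path.GetTempPath(), "async-copy-demo")).FullName

let pairs =
    [ for i in 1 .. 3 do
        let src = Path.Combine(demoDir, sprintf "source%d.txt" i)
        File.WriteAllText(src, sprintf "payload %d" i)
        yield (src, Path.Combine(demoDir, sprintf "target%d.txt" i)) ]

copyAllAsync pairs |> Async.RunSynchronously
```

Async.Parallel starts all of the computations and completes when the last one finishes; Async.Ignore discards the resulting array of unit values.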

Note: This is different to what you wrote above because File.Copy is a synchronous method that returns unit, while CopyToAsync is an async method that returns Task. You cannot magically make synchronous methods asynchronous by putting async wrappers around them; instead, you need to make sure you are using async all the way down.

TheInnerLight
  • Will `CopyToAsync` actually perform an async copy if the streams are created without the `FileOptions.Asynchronous` flag? See e.g. http://stackoverflow.com/a/35467471/82959. – kvb Apr 07 '17 at 14:38
  • @kvb Not sure. Lots of examples suggest this is okay but a quick look at the reference source makes me more doubtful. I have changed it to be on the safe side. – TheInnerLight Apr 07 '17 at 15:37
  • So this looks cool! Many thanks for the explanations too. How would you enhance it to do a given number of retries on an IOException? – user1443098 Apr 07 '17 at 16:56
  • @user1443098 Something like the `RetryRun` function in this answer: http://stackoverflow.com/a/9218869/5438433, then you could e.g. `RetryRun 5 (copyToAsync source dest)` – TheInnerLight Apr 07 '17 at 17:12
  • @user1443098 Perhaps for a more realistic and robust use though, you would need to think about what might cause such exceptions and how to recover from them by more than simply retrying. – TheInnerLight Apr 07 '17 at 17:14
  • For file copying, I'm thinking about IO errors caused by network outages, file server problems and the like. To recover from them, a retry after a short delay is often the only option. – user1443098 Apr 08 '17 at 13:09
  • There is now `Async.AwaitTask` for the wrapping of the `Task` in `Async`. – sdgfsdh Feb 07 '19 at 23:24
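For the retry question raised in the comments, one possible shape is sketched below. It is not the RetryRun function from the linked answer; the name retryOnIOException and its parameters are invented here, and the policy (a fixed delay, retrying on IOException only) is just the one described in the comments.

```fsharp
open System.IO

// Hypothetical helper: run `work`, and on an IOException wait `delayMs`
// milliseconds and try again, for at most `attempts` attempts in total.
// Any other exception, or an IOException on the last attempt, propagates.
let rec retryOnIOException (attempts : int) (delayMs : int) (work : Async<'T>) : Async<'T> =
    async {
        try
            return! work
        with
        | :? IOException when attempts > 1 ->
            do! Async.Sleep delayMs
            return! retryOnIOException (attempts - 1) delayMs work
    }

// Usage with a stand-in for a copy that fails twice before succeeding.
let calls = ref 0

let flakyCopy =
    async {
        calls.Value <- calls.Value + 1
        if calls.Value < 3 then raise (IOException "simulated network outage")
        return "copied"
    }

let outcome = retryOnIOException 5 10 flakyCopy |> Async.RunSynchronously
```

Because an Async value is only a description of work, passing the same value to each attempt reruns the whole operation, including reopening any files it opens.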

You can test it yourself with a few printfns. I found I had to call Async.RunSynchronously to force the main thread to wait for the copy to complete. I'm not sure why the await didn't work, but you can see the expected set of outputs indicating that the copy happened in the background.

open System
open System.IO
open System.Threading
open System.Threading.Tasks

let filecopyasync (source, target) =
    // Task.Run queues the copy on a thread pool thread.
    let task = Task.Run((fun () ->
          printfn "CopyThread: %d" Thread.CurrentThread.ManagedThreadId
          Thread.Sleep(10000)
          File.Copy(source, target, true)
          printfn "copydone"))

    printfn "mainThread: %d" Thread.CurrentThread.ManagedThreadId
    let result = Async.AwaitIAsyncResult task
    Thread.Sleep(3000)
    printfn "doing stuff"
    // Block until the task signals completion; the bool result is not needed.
    Async.RunSynchronously result |> ignore
    printfn "done"

Output:

filecopyasync (@"foo.txt",@"bar.txt");;
mainThread: 1
CopyThread: 7
doing stuff
copydone
done
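A likely reason the await alone appeared to do nothing is that F# async computations are cold: Async.AwaitIAsyncResult only builds an Async<bool> value, and nothing waits until that value is run (for example with Async.RunSynchronously, or with let! inside another async block). The sketch below illustrates this with Task.Delay standing in for the copy; the variable names are illustrative.

```fsharp
open System.Threading.Tasks

// Stand-in for the background copy: a task that completes after about 200 ms.
let task = Task.Delay 200

// Builds an Async<bool> describing the wait; this line itself does not wait.
let waiter = Async.AwaitIAsyncResult task

// Checked immediately, so the task has most likely not completed yet.
let completedBeforeRun = task.IsCompleted

// Running the computation is what blocks until the task signals completion.
let signalled = Async.RunSynchronously waiter
let completedAfterRun = task.IsCompleted

printfn "before run: %b, signalled: %b, after run: %b" completedBeforeRun signalled completedAfterRun
```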
Robert Sim
  • That looks good! Next question: Can I determine how big the thread pool is at any given time? I'd like to cap my thread use to avoid overwhelming the system. e.g. if I'm copying many files and some are quite large. Say I wanted to only use 5 threads (arbitrary). Can I find that out somehow? – user1443098 Apr 05 '17 at 00:26
  • .NET tasks use the ThreadPool: the CLR will manage your tasks and schedule them on the pool. If you want to limit the number of tasks you create, this C# answer might help: http://stackoverflow.com/questions/2898609/system-threading-tasks-limit-the-number-of-concurrent-tasks – Robert Sim Apr 05 '17 at 03:06
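One way to cap the number of operations in flight from F#, as a sketch rather than a translation of the linked C# answer, is to guard each computation with a SemaphoreSlim. The helper name throttled is invented here, and the sketch assumes FSharp.Core 4.0 or later so that Async.AwaitTask accepts the non-generic Task returned by WaitAsync. Note that this limits concurrent jobs, not thread pool threads.

```fsharp
open System.Threading

// Hypothetical helper: run `jobs` with at most `maxConcurrent` of them active at once.
let throttled (maxConcurrent : int) (jobs : Async<'T> list) : Async<'T[]> =
    async {
        use gate = new SemaphoreSlim(maxConcurrent)
        let guarded (job : Async<'T>) =
            async {
                // Wait for a free slot without blocking a thread.
                do! gate.WaitAsync() |> Async.AwaitTask
                try
                    return! job
                finally
                    // Free the slot whether the job succeeded or failed.
                    gate.Release() |> ignore
            }
        return! jobs |> List.map guarded |> Async.Parallel
    }

// Usage: six small jobs, at most two running at any moment.
let square (i : int) =
    async {
        do! Async.Sleep 20
        return i * i
    }

let results = throttled 2 [ for i in 1 .. 6 -> square i ] |> Async.RunSynchronously
```

Async.Parallel returns results in the same order as the input list, so the throttling does not reorder them.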

If all you're trying to do is run something on another thread while you do something else, then your initial Task.Run approach should be fine (note that you can get a Task<unit> if you call Task.Run<_> instead of the non-generic Task.Run, which might be marginally easier to deal with).

However, you should be clear about your goals - arguably a "proper" asynchronous file copy wouldn't require a separate .NET thread (which is a relatively heavy-weight primitive) and would rely on operating system features like completion ports instead; since System.IO.File doesn't provide a native CopyAsync method you'd need to write your own (see https://stackoverflow.com/a/35467471/82959 for a simple C# implementation that would be easy to transliterate).
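A hand-written copy along those lines might look like the sketch below. It is not a transliteration of the linked C# answer; the function name copyFileAsync and the buffer sizes are arbitrary choices. It opens both streams with FileOptions.Asynchronous and uses the AsyncRead and AsyncWrite extension methods that FSharp.Core adds to Stream.

```fsharp
open System.IO

// Hypothetical helper: copy `source` to `dest` with an explicit asynchronous read/write loop.
let copyFileAsync (source : string) (dest : string) : Async<unit> =
    async {
        // Ask for asynchronous (overlapped) I/O; SequentialScan is only a caching hint.
        let options = FileOptions.Asynchronous ||| FileOptions.SequentialScan
        use src = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, options)
        use dst = new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None, 4096, options)
        let buffer : byte[] = Array.zeroCreate 81920
        // Read a chunk, write it out, and repeat until a read returns 0 bytes.
        let rec loop () =
            async {
                let! read = src.AsyncRead(buffer, 0, buffer.Length)
                if read > 0 then
                    do! dst.AsyncWrite(buffer, 0, read)
                    return! loop ()
            }
        do! loop ()
    }

// Usage with throwaway files in the temp directory (illustrative names).
let demoSource = Path.Combine(Path.GetTempPath(), "copy-loop-source.bin")
let demoTarget = Path.Combine(Path.GetTempPath(), "copy-loop-target.bin")
File.WriteAllBytes(demoSource, Array.init 200000 (fun i -> byte (i % 251)))
copyFileAsync demoSource demoTarget |> Async.RunSynchronously
```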

kvb