
Suppose we have two I/O-bound tasks, A and B, that must be processed for each of N elements. B can only run after A has produced a result.

We can accomplish this in two ways. (Please ignore the "access to modified closure" warnings.)

Task.Run way:

List<Task> workers = new List<Task>();
for (int i = 0; i < N; i++)
{
    workers.Add(Task.Run(async () =>
    {
        await A(i);
        await B(i);
    }));
}
await Task.WhenAll(workers);

Classic Fork/Join:

List<Task> workersA = new List<Task>();
List<Task> workersB = new List<Task>();
for (int i = 0; i < N; i++)
{
    workersA.Add(A(i));
}

await Task.WhenAll(workersA);

for (int i = 0; i < N; i++)
{
    workersB.Add(B(i));
}

await Task.WhenAll(workersB);

Alternatively, this can also be done in the following way:

List<Task> workers = new List<Task>();

for (int i = 0; i < N; i++)
{
    workers.Add(A(i));
}

for (int i = 0; i < N; i++)
{
    await workers[i];
    workers[i] = B(i);
}

await Task.WhenAll(workers);

My concern is that the MSDN docs state that we should never use Task.Run for I/O operations.

Taking that into consideration, what's the best approach to handle this case?

Correct me if I'm wrong, but we want to avoid Task.Run because it effectively queues work onto threads, whereas if we just use await, no thread is involved (since the operations are I/O).

I would really like to go down the Task.Run route, but if it ends up using threads for no apparent reason or adds overhead, then it's a no-go.

SpiritBob

2 Answers


I really wish to go down the Task.Run route

Why?

but if it ends up using threads for no apparent reason, then it's a no-go.

The documentation says that it:

Queues the specified work to run on the ThreadPool

That doesn't necessarily mean a brand new thread for every time you call Task.Run. It might, but not necessarily. All you can guarantee is that it will run on a thread that is not the current one.

You have no control over how many threads get created to do all that work. But the recommendation to not use Task.Run for I/O operations is sound. It's needless overhead for no gain. It will be less efficient.

Either of your other solutions would work fine. Your last solution might finish quicker since you are starting the calls to B() sooner (you only wait for the first A() to finish before starting to call B() instead of waiting for them all to complete).

Update based on Theodor's answer: We're both right :) It's important to know that all the code in an async method before the first await (and the code after, unless you specify otherwise) will run in the same context it was started from. In a desktop app, that's the UI thread. The waiting is asynchronous. So the UI thread is freed while waiting. But if there is any CPU-heavy work in that method, it will lock the UI thread.
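The point about the code before the first await can be checked with a small console sketch. `WorkAsync` and `SyncPrefixDemo` are hypothetical names introduced only for illustration, and `Task.Delay` stands in for a real I/O call. The method records which thread ran its synchronous prefix; called directly, that is the caller's thread, while wrapping the call in Task.Run moves the prefix to a thread-pool thread.

```csharp
using System;
using System.Threading.Tasks;

static class SyncPrefixDemo
{
    // Hypothetical async method: returns the ID of the thread
    // that executed the code before its first await.
    public static async Task<int> WorkAsync()
    {
        int idBeforeAwait = Environment.CurrentManagedThreadId;
        await Task.Delay(10); // stand-in for a real I/O operation
        return idBeforeAwait;
    }

    static async Task Main()
    {
        int callerId = Environment.CurrentManagedThreadId;

        // Called directly: the synchronous prefix runs right here, on the caller's thread.
        Task<int> direct = WorkAsync();

        // Wrapped in Task.Run: the whole method, prefix included, is queued to the thread pool.
        Task<int> viaRun = Task.Run(() => WorkAsync());

        Console.WriteLine(await direct == callerId); // True
        Console.WriteLine(await viaRun == callerId);
    }
}
```

In a console app the second line should print False, because the main thread is not a thread-pool thread. In a desktop app the caller would be the UI thread, which is why a heavy synchronous prefix can freeze the UI.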

So Theodor is saying that you can use Task.Run to get it off the UI thread ASAP and guarantee it will never lock the UI thread. While that's true, you cannot blindly use that advice everywhere. For one, you may need to do something in the UI after the I/O operation, and that must be done on the UI thread. If you've run it with Task.Run, then you have to make sure to marshal back to the UI thread for that work.

But if the async method you call has enough CPU-bound work that it freezes the UI, then it's not strictly an I/O operation and the advice of "Use Task.Run for CPU-bound work, and async/await for I/O" still fits.

All I can say is: try it. If you find that whatever you're doing freezes the UI, then use Task.Run. If you find that it doesn't, then Task.Run is needless overhead. It's not much overhead, mind you, but it adds up if you're doing it in a loop like you are.

And all of that really applies only to desktop apps. If you're in ASP.NET, then Task.Run won't do anything for you unless you're trying to do something in parallel. In ASP.NET, there is no "UI thread", so it doesn't matter which thread you do the work on. You just want to make sure you don't block the thread while waiting (since there is a limited number of threads in ASP.NET).
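For the ASP.NET case, one way to keep the per-element chaining (B starts as soon as its own A finishes) without Task.Run is to move the two awaits into a single async method and collect those tasks. This is only a sketch: `A`, `B`, `AThenB`, and `RunAllAsync` are hypothetical stand-ins, `Task.Delay` simulates the I/O, and passing A's result into B is an assumption about how the two calls relate.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class Pipeline
{
    // Hypothetical stand-ins for the question's I/O-bound A and B.
    static async Task<int> A(int i) { await Task.Delay(50); return i * 2; }
    static async Task<int> B(int a) { await Task.Delay(50); return a + 1; }

    // Chains B after A for a single element; no Task.Run involved.
    static async Task<int> AThenB(int i)
    {
        int a = await A(i);
        return await B(a);
    }

    // Starts all N chains, then waits for every one of them.
    // Results come back in the same order as the inputs.
    public static Task<int[]> RunAllAsync(int n) =>
        Task.WhenAll(Enumerable.Range(0, n).Select(i => AThenB(i)));

    static async Task Main()
    {
        int[] results = await RunAllAsync(3);
        Console.WriteLine(string.Join(", ", results)); // 1, 3, 5
    }
}
```

Note that any synchronous code at the start of `A` still runs on the calling thread, once per element, before the loop moves on.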

Gabriel Luci
  • Could you please check on Theodor Zoulias's answer? Is it okay, and most efficient to use `Task.Run(async`, or is there a misconception? Your current answer clearly states to avoid it, so one of the two answers must have some flaws. – SpiritBob Nov 09 '19 at 08:07
  • My comment was getting too long so I updated my answer. – Gabriel Luci Nov 09 '19 at 13:49
  • Thank you! In my specific case it's regarding an ASP.NET application. – SpiritBob Nov 10 '19 at 07:34
  • To make sure I understood you - if we use `Task.Run` with an `async` delegate, in reality if the operation is truly I/O, then the thread will be immediately freed upon hitting the `await`. Effectively working as if no Task.Run was launched, because that's what would happen if we just awaited the I/O operation? If all of that's correct. If we have a small synchronous code tied in the I/O call, would you recommend using Task.Run for that case? (ASP.NET application) Thank you for your insight! – SpiritBob Nov 11 '19 at 09:24
  • `Task.Run` moves execution to another thread. The `await` keyword frees the current thread while waiting. So if you `await Task.Run`, and you `await` something inside of the `async` delegate you called with `Task.Run`, then both threads will be freed while waiting. – Gabriel Luci Nov 11 '19 at 12:00
  • In ASP.NET, the *only* use case for `Task.Run` is to run something in parallel - move something into another thread, don't `await` until you've done some other work on the current thread - like this: `var myTask = Task.Run(() => whatever()); /* do something else */ await myTask;` – Gabriel Luci Nov 11 '19 at 12:05
  • Using `await Task.Run` has no value in ASP.NET. There is no point moving execution to another thread just to immediately wait for it. That's true whether what you're calling is I/O or CPU bound. – Gabriel Luci Nov 11 '19 at 12:11
  • Indeed, I fully understand you! But I was referring to my original question when involving a list of tasks. Whether it's okay to populate the list of tasks with `Task.Run`'s, or to simply pass a method combining the two async I/O tasks (not truly I/O, there IS a small synchronous code segment to be executed, in order to setup the true I/O operation.) – SpiritBob Nov 11 '19 at 12:25
  • I see. I looked at your question again. Using `Task.Run` will run the synchronous parts of the code in parallel. Whether that is "more efficient" depends on your code (does the benefit of running in parallel outweigh the cost of switching threads?). The only real way to know is to try both ways. You may find that it doesn't make any discernible difference in execution time. – Gabriel Luci Nov 11 '19 at 12:57
  • Yeah, it definitely wouldn't. Thank you! – SpiritBob Nov 11 '19 at 13:02

If the work you have is I/O-bound, use async and await without Task.Run. You should not use the Task Parallel Library. The reason for this is outlined in the Async in Depth article.

This piece of advice, although it comes from Microsoft's own site, is misleading. By discouraging Task.Run for I/O operations, the author probably had this in mind:

var data = await Task.Run(() =>
{
    return webClient.DownloadString(url); // Blocking call
});

...which is indeed bad because it blocks a thread-pool thread. But using Task.Run with an async delegate is perfectly fine:

var data = await Task.Run(async () =>
{
    return await webClient.DownloadStringTaskAsync(url); // Async call
});

Actually, in my opinion, this is the preferred way of initiating asynchronous operations from the event handlers of a UI application, because it ensures that the UI thread will be freed immediately. If instead you follow the article's advice and omit the Task.Run:

private async void Button1_Click(object sender, EventArgs args)
{
    var data = await webClient.DownloadStringTaskAsync(url);
}

...then you run the risk that the async method is not 100% async and blocks the UI thread. This is a tiny concern for built-in async methods like DownloadStringTaskAsync, which are written by experts, but it becomes a greater concern for third-party async methods, and an even greater concern for async methods written by the developers themselves!

So regarding the options in your question, I believe that the first one (the Task.Run way) is the safest and the most efficient. The second one awaits all A tasks and then all B tasks separately, so its duration will be at best Max(A) + Max(B), which can never be shorter than Max(A + B) and will usually be longer.
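To make the Max(A) + Max(B) versus Max(A + B) comparison concrete, here is a small worked example. The durations are made-up numbers in milliseconds, `DurationModel` and its methods are hypothetical names, and the model ignores scheduling overhead entirely.

```csharp
using System;
using System.Linq;

static class DurationModel
{
    // Fork/join: wait for every A, then start and wait for every B.
    public static int ForkJoin(int[] a, int[] b) => a.Max() + b.Max();

    // Per-element chaining: each B starts as soon as its own A finishes.
    public static int Chained(int[] a, int[] b) => a.Zip(b, (x, y) => x + y).Max();

    static void Main()
    {
        int[] a = { 100, 300 }; // made-up durations of A(0), A(1)
        int[] b = { 300, 100 }; // made-up durations of B(0), B(1)

        Console.WriteLine(ForkJoin(a, b)); // 600
        Console.WriteLine(Chained(a, b));  // 400
    }
}
```

The two totals are equal only when the slowest A and the slowest B belong to the same element.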

Theodor Zoulias
  • What about just creating a method AB(i) as @Brandon suggested? I've gone down that route. – SpiritBob Nov 09 '19 at 08:04
  • @SpiritBob Brandon's suggestion is dangerous. If the method A has a synchronous part, this part will run synchronously N times in the current thread. – Theodor Zoulias Nov 09 '19 at 09:28
  • To put things into perspective, `Task.Run` has an overhead of around 2 μsec. In other words when you use it you buy safety at the cost of one second of CPU time every 500,000 invocations. – Theodor Zoulias Nov 09 '19 at 09:48
  • Thank you Theodor! My biggest concern when using `Task.Run` isn't performance issues, but rather starving the Thread pool. Because it's in a loop where N might be any possible number (not very large though), which is found in an ASP.NET application, and I know there that having no threads free means unable to service other requests. Even if we assume that the methods have very tiny initialization in the form of a synchronously written code, like a simple initialization of a `HttpRequestMessage`, or preparations to the I/O call, would going down the Task.Run route prove to be troublesome? – SpiritBob Nov 11 '19 at 09:06
  • Please factor in that in reality many people will be hitting the endpoint and reaching this piece of code sooner or later. We have no UI thread here, so the question is whether it's okay to use `Task.Run` in these circumstances. – SpiritBob Nov 11 '19 at 09:09
  • As much as I can tell `await Task.Run(async` doesn't offer any value for ASP.NET applications since, as Gabriel Luci pointed out, in ASP.NET it doesn't matter which thread you do the work on. The overhead of using it is minuscule, and should make no noticeable difference in the thread-pool availability, but since it offers no advantages I agree that you should avoid it. Not only for saving these 2 microseconds per invocation, but also for keeping your code clean and streamlined. – Theodor Zoulias Nov 11 '19 at 09:48