Task.Run to increase parallelism of IO-bound operations?

Question

I'm getting a bit confused with Task.Run and all I read about it on the internet. So here's my case: I have some function that handles incoming socket data:

public async Task Handle(Client client)
{
    while (true)
    {
        var data = await client.ReadAsync();
        await this.ProcessData(client, data);
    }
}

but this has a disadvantage that I can only read next request once I've finished processing the last one. So here's a modified version:

public async Task Handle(Client client)
{
    while (true)
    {
        var data = await client.ReadAsync();
        Task.Run(async () => {
            await this.ProcessData(client, data);
        });            
    }
}

It's a simplified version. For more advanced one I would restrict the maximum amount of parallel requests of course.

Anyway this ProcessData is mostly IO-bound (doing some calls to dbs, very light processing and sending data back to client) yet I keep reading that I should use Task.Run with CPU-bound functions.

Is that a correct usage of Task.Run for my case? If not what would be an alternative?

@DanielA.White It is. But next `ReadAsync()` call has to wait for previous `ProcessData` (asynchronous or not) to finish (due to `await`). For example if I send requests A,B,C, then at the moment total processing time is t(A)+t(B)+t(C). If I were to run them in background I could change that to max(t(A), t(B), t(C)). The second version does that. But I'm not sure if it is correct. — freakish, Jan 24 '18 at 15:50
@freakish You have no reason to use `Task.Run` to do that. See [this answer](https://stackoverflow.com/a/23833635/) for an explanation on the difference between multithreading and parallelism. — Servy, Jan 24 '18 at 15:53
@Servy How can I run an asynchronous function in background without `Task.Run`? The question is not about threads. I don't see how the linked post is relevant. If you have an answer the feel free to post it. — freakish, Jan 24 '18 at 15:55
@freakish You don't need to run an asynchronous function in a background thread *at all*, and thus you don't need to use `Task.Run` because it exists purely to run a synchronous method in a background thread (which you don't need to do). Your question *is* about threads, and apparently about parallelism. It's about threading because `Task.Run` exists to run a method in another thread, and it's about parallelism because you want to have three operations run in parallel. You can see my other answer for an explanation of why you don't need threads at all to run the operations in parallel. — Servy, Jan 24 '18 at 15:59
@Servy That really doesn't answer the comment, though. The question is basically "how would I best make this function run so that the next client can be consumed (and not wait until this client has finished processing before continuing)?" To which the OP best answered with "use Task.Run". — Camilo Terevinto, Jan 24 '18 at 16:02
@CamiloTerevinto `Task.Run` does nothing to accomplish that though, so no, that's *not* how the problem is best answered. — Servy, Jan 24 '18 at 16:03
@Servy Oh, but it does. The second code **does not await** the `Task.Run()` result. Hence I can `ReadAsync()` again without waiting for `ProcessData`. — freakish, Jan 24 '18 at 16:04
@freakish You don't need to call `Task.Run` to not await a task. If you don't want to await a `Task` then *don't await the `Task`*. Wrapping it in a call to `Task.Run` doesn't change anything. — Servy, Jan 24 '18 at 16:07
@Servy Hmm, so a not-awaited task will still execute in the background? I didn't know that. That seems quite counterintuitive (or maybe I'm biased since it doesn't work like that in Python). In that case I'm looking forward to more details and I encourage you to post an answer. — freakish, Jan 24 '18 at 16:16
@freakish It doesn't (necessarily) run in a background thread, no. It runs asynchronously (which is of course true by definition, because it's asynchronous). That the method returns before the work is completed, and that the work will be done at some point in the future without your interaction *is what it means for an operation to be asynchronous*. — Servy, Jan 24 '18 at 16:19
@Servy yeah, yeah. Thats what I mean by background (I deliberately avoid the word thread). I've tried your suggestion and it seems to work fine. As a side question: will the framework (asp.net core in my case) take care of utilizing multiple cpu cores without $Task.Run$ for multiple requests for one (websocket) client? Not that it matters, just curious. — freakish, Jan 24 '18 at 16:28
@freakish Different asynchronous operations can accomplish their asynchrony in any number of ways. They could use threads or they could *not need any threads at all to do their work*, as is the case for IO bound work. — Servy, Jan 24 '18 at 16:30

score 7 · Accepted Answer · answered Jan 24 '18 at 16:05

7

Conceptually, that is a fine usage of Task.Run. It's very similar to how ASP.NET dispatches requests: (asynchronously) reading a request and then dispatching (synchronous or asynchronous) work to the thread pool.

In practice, you'll want to ensure that the result of ProcessData is handled properly. In particular, you'll want to observe exceptions. As the code currently stands, any exceptions from ProcessData will be swallowed, since the task returned from Task.Run is not observed.

IMO, the cleanest way to handle per-message errors is to have your own try/catch, as such:

public async Task Handle(Client client)
{
  while (true)
  {
    var data = await client.ReadAsync();
    Task.Run(async () => {
      try { await this.ProcessData(client, data); }
      catch (Exception ex) {
        // TODO: handle
      }
    });            
  }
}

where the // TODO: handle is the appropriate error-handling code for your application. E.g., you might send an error response on the socket, or just log-and-ignore.

answered Jan 24 '18 at 16:05

Stephen Cleary

437,863
77
675
810

I understand its a bad idea to wrap IO operations inside `Task.Run()`. With that in mind, suppose we have multiple IO operations (DB or WCF calls which don't have `async` apis) in an ASP.NET application and they are awaited using `Task.WhenAll(task1_DB, task2_WCF, task3_DB...)`. Please note that all tasks are IO-bound. Do you recommend using `Task.Run()`? If so, how? – NrN Mar 22 '18 at 04:50
@NrN: Not if you can avoid it. The best solution is to make your DB and WCF calls asynchronous. `Task.Run` would consume extra thread pool threads for each request, which would negatively impact your scalability. – Stephen Cleary Mar 22 '18 at 05:02
Thanks for the reply. Is it fine for an MVC controller not being `async` if the underlying operations such as DB or WCF are not truly `async`. Can we leave the controller action method synchronous? Or should we wrap the result with `Task.FromResult` in order to make the action method `async`? Also, what would you suggest `await`ing `Task.FromResult` – NrN Mar 22 '18 at 05:37
@NrN: If all your underlying operations are synchronous, then your API endpoint should be synchronous. – Stephen Cleary Mar 22 '18 at 14:17
@StephenCleary, I'm i bit confused by this answer. You say that if the IO-bound calls are truly async then calling Task.Run would consume extra thread pool threads. Then why is it a good idea to use Task.Run to wrap asynchronous IO operations in this case? – Edminsson Dec 28 '20 at 11:37
@Edminsson: In this case, OP is essentially writing the ASP.NET framework. Pushing each request to a thread pool thread allows the main loop to continue reading requests as quickly as possible. If you *already have* a framework that is *already doing this* (i.e., ASP.NET), then you definitely want to avoid `Task.Run` in your own code. – Stephen Cleary Dec 28 '20 at 15:42

Task.Run to increase parallelism of IO-bound operations?

1 Answers1