2

I am reviewing some code and trying to come up with a technical reason why you should or should not use Task.WhenAll(Tasks[]) for essentially making HTTP calls in parallel. The HTTP calls go to a different microservice, and I guess one of the calls may or may not take some time to execute... (I guess I am not really interested in that). I'm using BenchmarkDotNet to give me an idea of whether any more memory is consumed, or whether execution time is wildly different. Here is an over-simplified example of the benchmarks:

[Benchmark]
public async Task<string> Task_WhenAll_Benchmark()
{
  var t1 = Task1();
  var t2 = Task2();

  await Task.WhenAll(t1, t2);

  return $"{t1.Result}===={t2.Result}";
}

[Benchmark]
public async Task<string> Task_KeepItSimple_Benchmark()
{
  return $"{await Task1()}===={await Task2()}";
}

Task1 and Task2 methods are really simple (I have a static HttpClient in the class)

public async Task<string> Task1()
{
  using (var request = await httpClient.GetAsync("http://localhost:8000/1.txt"))
  {
    return $"task{await request.Content.ReadAsStringAsync()}";
  }
}

public async Task<string> Task2()
{
  using (var request = await httpClient.GetAsync("http://localhost:8000/2.txt"))
  {
    return $"task{await request.Content.ReadAsStringAsync()}";
  }
}

And my results

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Task_WhenAll_Benchmark | 1.138 ms | 0.0561 ms | 0.1601 ms | - | - | - | 64 KB |
| Task_KeepItSimple_Benchmark | 1.461 ms | 0.0822 ms | 0.2331 ms | - | - | - | 64 KB |

As you can see, memory is not an issue and there isn't a great deal of difference in execution time either.

My question really is, is there a technical reason why you should or should not use Task.WhenAll()? Is it just a preference?

I came across Async Guidance from a guy on the .net core team but it hasn't really covered this scenario.

Edit: this is .net framework (4.6.1) rather than core!

Edit 2: Updated one of the benchmarks as suggested in a comment below.

I updated the benchmark for the KeepItSimple approach...

[Benchmark]
public async Task<string> Task_KeepItSimple_Benchmark()
{
  var t1 = Task1();
  var t2 = Task2();

  return $"{await t1}===={await t2}";
}

and have the following results:

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Task_WhenAll_Benchmark | 1.134 ms | 0.0566 ms | 0.1613 ms | - | - | - | 64 KB |
| Task_KeepItSimple_Benchmark | 1.081 ms | 0.0377 ms | 0.1070 ms | - | - | - | 64 KB |

And now I am even more confused - how is the execution faster (albeit a tiny amount!)? I thought execution of the code started when you await the result...

Matt
  • There should not be a huge difference between using `Task.WhenAll(Tasks[])` or `await t1`; then `await t2`; for only two tasks. If you `await t1`, and `t2` finishes before `t1`, then `await t2` will just encounter a completed task. The difference is more noticeable when you increase the number of tasks. – V.Lorz May 12 '21 at 15:58
  • Shouldn't `Task_KeepItSimple_Benchmark` match the `var t1 = Task1();` `var t2 = Task2();` from `Task_WhenAll_Benchmark` to be a fair comparison? In `Task_KeepItSimple_Benchmark`, you are awaiting all of task1 before task2 can start. – Connor Low May 12 '21 at 16:32
  • Agreed, the keep it simple one currently executes them sequentially so it's not really a valid comparison – ADyson May 12 '21 at 17:04
  • `I thought execution of the code started when you await the result` - not at all. The [method begins executing when it is called](https://blog.stephencleary.com/2012/02/async-and-await.html). The `await` (or "asynchronous wait") is when the calling code (asynchronously) waits for the method to complete. – Stephen Cleary May 13 '21 at 10:15
  • `how is the execution faster`...the quantities of time and the gap between them are so minuscule that a slight variation in network conditions, or server load, could easily cause that, even though you're using a local server. You'd have to test it dozens of times to begin to get any kind of meaningful average. – ADyson May 13 '21 at 10:19
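
To illustrate Stephen Cleary's comment above, here is a minimal sketch, not the benchmark code itself. `WorkAsync` is a hypothetical stand-in for `Task1()`/`Task2()`, with `Task.Delay` assumed in place of the HTTP call. The log shows that each method starts running at the call, before anything is awaited:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StartOrderDemo
{
    // Records the order in which things happen.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for Task1()/Task2(); Task.Delay replaces the HTTP call.
    public static async Task<string> WorkAsync(string name)
    {
        Log.Add(name + " started");   // runs synchronously, as part of the call itself
        await Task.Delay(50);         // control returns to the caller here
        return name;
    }

    public static async Task<List<string>> RunAsync()
    {
        Log.Clear();
        var t1 = WorkAsync("t1");     // t1 is already in flight after this line
        var t2 = WorkAsync("t2");     // t2 starts while t1 is still pending
        Log.Add("before first await");
        await t1;                     // only waits; it does not start anything
        await t2;
        return Log;
    }

    public static void Main()
    {
        // Prints: t1 started, t2 started, before first await (one per line).
        foreach (var line in RunAsync().GetAwaiter().GetResult())
            Console.WriteLine(line);
    }
}
```

Because both calls are made before the first `await`, the two operations overlap whether you then use `Task.WhenAll` or await them one after the other.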

3 Answers

11

My question really is, is there a technical reason why you should or should not use Task.WhenAll()?

The behavior is just slightly different in the case of exceptions when both calls fail. If they're awaited one at a time, the second failure will never be observed; the exception from the first failure is propagated immediately. If using Task.WhenAll, both failures are observed; one exception is propagated after both tasks fail.
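
The difference can be sketched as follows. This is an illustration under assumptions, not code from the question: `FailAsync` is a hypothetical stand-in for a failing HTTP call, and the delays are arbitrary values chosen so that `t1` fails well before `t2`.

```csharp
using System;
using System.Threading.Tasks;

public static class FailureDemo
{
    // Hypothetical stand-in for an HTTP call that fails after a delay.
    public static async Task<string> FailAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);
        throw new InvalidOperationException(name + " failed");
    }

    // Awaiting one at a time: t1's exception propagates as soon as t1 is awaited.
    public static async Task<string> OneAtATimeAsync()
    {
        var t1 = FailAsync("t1", 10);
        var t2 = FailAsync("t2", 200);
        try
        {
            return $"{await t1}===={await t2}";
        }
        catch (InvalidOperationException ex)
        {
            // t2 is normally still running at this point; its failure is never observed.
            return $"{ex.Message}; t2 completed: {t2.IsCompleted}";
        }
    }

    // Task.WhenAll: nothing propagates until both tasks have finished.
    public static async Task<string> WhenAllAsync()
    {
        var t1 = FailAsync("t1", 10);
        var t2 = FailAsync("t2", 200);
        try
        {
            await Task.WhenAll(t1, t2);
            return $"{t1.Result}===={t2.Result}";
        }
        catch (InvalidOperationException ex)
        {
            // One exception is rethrown, but both tasks have completed (and faulted).
            return $"{ex.Message}; t1 faulted: {t1.IsFaulted}; t2 faulted: {t2.IsFaulted}";
        }
    }

    public static void Main()
    {
        Console.WriteLine(OneAtATimeAsync().GetAwaiter().GetResult());
        Console.WriteLine(WhenAllAsync().GetAwaiter().GetResult());
    }
}
```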

Is it just a preference?

It's mostly just preference. I tend to prefer WhenAll because the code is more explicit, but I don't have a problem with awaiting one at a time.

Stephen Cleary
  • Does `return $"{await Task1()}===={await Task2()}";` cause them to be executed in parallel or sequentially? – ADyson May 12 '21 at 15:54
  • @ADyson In your comment above, *sequentially*, but with `Task.WhenAll`, the waits will happen in parallel. With HTTP requests to a single host running sequentially, most of the overhead of setting up a connection will be dealt with by the first request. Be aware that you can't "scale up" indefinitely by going parallel, and the requests will start queueing when you reach other limits. – spender May 12 '21 at 16:01
  • @ADyson: Good point; they would be sequential then. My answer would be correct if the sample code was `return $"{await task1}===={await task2}";` – Stephen Cleary May 12 '21 at 16:04
  • As I put in my question, it is an over simplification of what the actual code is doing. In reality, the results are stored in variables and used further in the execution of the code (hence me demonstrating that in my return I either `await` or `.Result` the async call) – Matt May 13 '21 at 07:46
3

Task.WhenAll is just a method that creates a task that will complete when all of the supplied tasks have completed. That's it. There are some pros and cons, but the general idea behind it is to be able to manage lots of tasks easily.

When you start your task it gets scheduled by your TaskScheduler no matter if you use it with or without the discussed method.

The benefit is that if you generate your tasks in a loop, for example, you don't have to manage each and every one of them. e.g.

var tasks = new List<Task<string>>();
for (var i = 0; i < 10; i++)
{
    // Start a task here; Task1() from the question is used as a placeholder.
    var task = Task1();
    tasks.Add(task);
}
// Completes when every task has completed; results are in the same order as the tasks.
string[] results = await Task.WhenAll(tasks);

You don't have to await each of these individually, and if any of the underlying tasks throws, the exception is rethrown when you await the WhenAll task.

This is where a complication arises: the task returned by WhenAll stores the failures in an AggregateException, so to find the original problems you have to iterate through its InnerExceptions property. However, since the await keyword unwraps the AggregateException, awaiting WhenAll rethrows only the first inner exception and you won't see whether there were other failures. To get at the full AggregateException you have to keep a reference to the WhenAll task and read its Exception property, or do it the old way, with continuations. But that is another story.

Btw, for your second benchmark to be closer to what you have with the WhenAll method, you should refactor it into this:
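
As a rough sketch of how all failures can still be recovered: keep a reference to the task returned by `Task.WhenAll` and inspect its `Exception` property after the `await` throws. `FailAsync` and `CollectFailuresAsync` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AggregateDemo
{
    // Hypothetical stand-in for an operation that fails.
    public static async Task<string> FailAsync(string name)
    {
        await Task.Delay(10);
        throw new InvalidOperationException(name + " failed");
    }

    public static async Task<List<string>> CollectFailuresAsync()
    {
        // Keep the WhenAll task itself, not just the awaited result.
        Task<string[]> whenAll = Task.WhenAll(FailAsync("a"), FailAsync("b"));
        try
        {
            await whenAll;
        }
        catch (InvalidOperationException)
        {
            // await rethrew a single exception; the task still holds every failure
            // inside its AggregateException.
            return whenAll.Exception.InnerExceptions.Select(e => e.Message).ToList();
        }
        return new List<string>();
    }

    public static void Main()
    {
        foreach (var message in CollectFailuresAsync().GetAwaiter().GetResult())
            Console.WriteLine(message);
    }
}
```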

[Benchmark]
public async Task<string> Task_KeepItSimple_Benchmark()
{
  var t1 = Task1();
  var t2 = Task2();

  return $"{await t1}===={await t2}";
}
Gleb
0

The answer is: use it when you need it. It is different from simply awaiting the tasks one after the other, although in both cases they still run asynchronously. In the WhenAll case, be prepared for more than one failure: awaiting rethrows only the first exception, while the task's Exception property holds the AggregateException with all of them. If you have a foreach loop and intend to use await inside it, you get better control by starting the tasks in the loop and then using WhenAll. The topics that cover this scenario are TPL, TaskScheduler, PLINQ and Parallel Extensions. It's all about running a known or unknown, possibly large, number of tasks asynchronously with a certain degree of parallelism. Your example is like a Parallel.ForEach over all the URLs.
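
One common way to get "a certain degree of parallelism" with plain tasks is to gate them with a `SemaphoreSlim` and then await `Task.WhenAll`. The following is a sketch under assumptions, not the approach from the linked answers: `RunThrottledAsync` is a hypothetical helper, and `Task.Delay` stands in for the HTTP call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs 'work' for every item with at most 'maxParallel' operations in flight.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, int maxParallel, Func<T, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();           // wait for a free slot
                try { return await work(item); }
                finally { gate.Release(); }       // free the slot even on failure
            }).ToList();                          // ToList starts all the tasks now

            // Results come back in the same order as the input items.
            return await Task.WhenAll(tasks);
        }
    }

    public static void Main()
    {
        var names = new[] { "1.txt", "2.txt", "3.txt" };
        var results = RunThrottledAsync(names, 2, async name =>
        {
            await Task.Delay(20);                 // placeholder for the real HTTP call
            return name.ToUpperInvariant();
        }).GetAwaiter().GetResult();

        Console.WriteLine(string.Join(",", results)); // prints "1.TXT,2.TXT,3.TXT"
    }
}
```

The semaphore is only disposed after `Task.WhenAll` has completed, which happens once every gated task has finished, including failed ones.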

Some advanced and universal examples: https://stackoverflow.com/a/39174881/1025264

https://stackoverflow.com/a/57519218/1025264

TPL variant: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.dataflow.actionblock-1.-ctor?redirectedfrom=MSDN&view=net-5.0#System_Threading_Tasks_Dataflow_ActionBlock_1__ctor_System_Action__0__System_Threading_Tasks_Dataflow_ExecutionDataflowBlockOptions_

vezenkov