Say I want to make parallel API POST requests.

In a for loop I can add each HTTP POST call to a list of tasks (each task started using Task.Run) and then wait for all of them to finish using await Task.WhenAll. Control thus returns to the caller while waiting for the network requests to complete. Effectively, the API requests are made in parallel.

Similarly, I can use Parallel.ForEachAsync, which will automatically do the WhenAll and return control to the caller. So I want to ask: is ForEachAsync a replacement for a plain for loop that builds a list of tasks (async/await with Task.Run) followed by WhenAll?
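
For concreteness, the two patterns being compared might look roughly like the sketch below. `FakePostAsync` is a hypothetical stand-in for the real HTTP POST (it only simulates latency so the sample runs offline), and the class and method names are illustrative assumptions. `Parallel.ForEachAsync` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PostPatterns
{
    // Hypothetical stand-in for an HTTP POST call; it only simulates
    // network latency and echoes back a value derived from the payload.
    public static async Task<int> FakePostAsync(int payload, CancellationToken ct = default)
    {
        await Task.Delay(50, ct);
        return payload * 2;
    }

    // Pattern 1: start every request in a loop, then await them all.
    // Task.Run is omitted here: calling the async method directly
    // already returns a task that is in progress.
    public static async Task<int[]> WithWhenAllAsync(IEnumerable<int> payloads)
    {
        List<Task<int>> tasks = new();
        foreach (int payload in payloads)
        {
            tasks.Add(FakePostAsync(payload));
        }
        // Results come back in the same order as the tasks in the list.
        return await Task.WhenAll(tasks);
    }

    // Pattern 2: let Parallel.ForEachAsync drive the invocations.
    // It returns a plain Task, so this sketch only counts completions.
    public static async Task<int> WithForEachAsync(IEnumerable<int> payloads)
    {
        int completed = 0;
        await Parallel.ForEachAsync(payloads, async (payload, ct) =>
        {
            await FakePostAsync(payload, ct);
            Interlocked.Increment(ref completed);
        });
        return completed;
    }
}
```

Both methods return a task to the caller as soon as the first incomplete await is reached, so neither blocks the calling thread while the requests are in flight.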

Theodor Zoulias
variable
  • 2
    No, it's not. `Parallel.ForEach` does a *lot* more than just use multiple tasks. - it partitions the data so that each worker task won't have to synchronize with others to access the data. Then it uses as many workers as there are cores to process those partitions. There's little point in starting 100 workers if there are only 4 cores. The other 96 workers will simply do nothing except add to the scheduling overhead – Panagiotis Kanavos Jul 27 '21 at 11:57
  • `which will automatically do the WaitAll` that's not what happens. `Parallel` will use the current thread to process data, and since all cores are busy crunching data, it appears as if the thread is "blocked". It's not – Panagiotis Kanavos Jul 27 '21 at 11:58
  • A threading equivalent to `Parallel.ForEachAsync` is an `ActionBlock` with a DOP equal to the number of cores, using an async lambda. Even then, an ActionBlock doesn't deal with partitioning, nor does it dynamically alter the number of workers, or handle load balancing the way `Parallel.ForEach` does – Panagiotis Kanavos Jul 27 '21 at 12:01
  • 2
    In fact, an ActionBlock would be a *lot* better than a loop and WaitAll. With an ActionBlock you can limit the number of concurrent connections easily. Neither servers nor clients have infinite bandwidth or CPU, so trying to send 100 HTTP requests concurrently can easily be *slower* than making just 10 at a time – Panagiotis Kanavos Jul 27 '21 at 12:03
  • 1
    PS: `ForEachAsync` returns a `Task`, so it behaves as if you called `WhenAll`, not `WaitAll` – Panagiotis Kanavos Jul 27 '21 at 12:04
  • 1
    I found the [GitHub issue where ForEachAsync was discussed](https://github.com/dotnet/runtime/issues/1946) and it sounds like partitioning is *not* used. `ForEach` and `ForEachAsync` pass state between workers and iterations though, something not possible with either a loop of tasks or `ActionBlock`. And as [the source shows](https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/System.Threading.Tasks.Parallel/src/System/Threading/Tasks/Parallel.ForEachAsync.cs#L88) it doesn't just start some tasks. – Panagiotis Kanavos Jul 27 '21 at 12:21
  • So `Parallel.ForEach` uses 1 core per task, whereas async/await makes use of 1 thread per task? – variable Aug 21 '21 at 03:35
  • @PanagiotisKanavos you could consider moving your comments into a new answer. Comments are intended for asking for more information or suggesting improvements. Answering questions in comments [should be avoided](https://prnt.sc/RfK7kkKxohGr "Use comments to ask for more information or suggest improvements. Avoid answering questions in comments."). – Theodor Zoulias Jun 27 '22 at 02:54
  • @TheodorZoulias I know the distinction very well. In fact, given it's a year since I posted the comment, I don't think that's what this comment is about. – Panagiotis Kanavos Jun 27 '22 at 08:10
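
The `ActionBlock` approach mentioned in the comments above could be sketched as follows. This is a minimal illustration, not the commenter's code: the class name, the simulated delay, and the counter are assumptions. `ActionBlock<T>` lives in the `System.Threading.Tasks.Dataflow` namespace, which on some target frameworks may need to be added as a NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ActionBlockSketch
{
    // Processes the items with at most maxConcurrency operations in flight
    // and returns how many items were processed.
    public static async Task<int> ProcessAsync(IEnumerable<int> items, int maxConcurrency)
    {
        int processed = 0;

        var block = new ActionBlock<int>(async item =>
        {
            await Task.Delay(20); // stand-in for an HTTP request for this item
            Interlocked.Increment(ref processed);
        },
        new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxConcurrency // limits concurrent invocations
        });

        foreach (int item in items)
        {
            block.Post(item); // queue the item; the block is unbounded by default
        }

        block.Complete();       // signal that no more items will be posted
        await block.Completion; // wait until all queued items are processed
        return processed;
    }
}
```

Calling `ActionBlockSketch.ProcessAsync(items, 10)` would, under these assumptions, keep at most ten simulated requests running at any time.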

1 Answer

No, the Parallel.ForEachAsync API has quite a lot of differences compared to a trivial use of the Task.WhenAll API:

  1. The elephant in the room: awaiting Task.WhenAll returns an array with the results of the asynchronous operations. In contrast, Parallel.ForEachAsync returns a plain Task. If you want the results you must rely on side effects, like updating a ConcurrentQueue<T> as part of the asynchronous operation.

  2. Parallel.ForEachAsync invokes the supplied asynchronous delegate in parallel, on ThreadPool threads (configurable). In contrast, the common pattern with Task.WhenAll is to create the tasks sequentially, on the current thread. This raises concerns about using Parallel.ForEachAsync in ASP.NET applications, where offloading work to the ThreadPool might have scalability implications.

  3. Parallel.ForEachAsync invokes the asynchronous delegate and awaits the generated tasks while enforcing a maximum level of concurrency, which by default is equal to Environment.ProcessorCount. This behavior is configurable through the MaxDegreeOfParallelism option. In contrast, the common pattern with Task.WhenAll is to create all the tasks at once, imposing no concurrency limit.

  4. The common pattern with Task.WhenAll is to assume that creating the tasks cannot fail midway, and so to take no precautions against this possibility. If it does happen, the tasks that have already been created might be leaked as fire-and-forget tasks. This is not possible with the Parallel.ForEachAsync API.

  5. Parallel.ForEachAsync stops invoking the asynchronous delegate as soon as the first error occurs, either in an invocation of the asynchronous delegate or in a created Task. It then awaits all the already created tasks and propagates a failure containing all the errors that have occurred so far. It also provides a mechanism for canceling the other tasks that are in flight when the error occurs (the CancellationToken that is passed as the second argument of the lambda). In contrast, Task.WhenAll invariably waits for all the tasks to complete. This means that you might have to wait a lot longer before eventually receiving an AggregateException containing the errors of all the tasks that have failed.
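
Points 1 and 3 above can be illustrated with a minimal sketch (assuming .NET 6 or later). The class name, the simulated delay, and the concurrency tracking are illustrative assumptions added for this example; only `Parallel.ForEachAsync`, `ParallelOptions.MaxDegreeOfParallelism`, and `ConcurrentQueue<T>` come from the discussion above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Runs a simulated asynchronous operation per item with bounded concurrency.
    // Returns the collected results and the highest concurrency observed.
    public static async Task<(int[] Results, int PeakConcurrency)> RunAsync(
        IEnumerable<int> items, int maxDegreeOfParallelism)
    {
        // Point 1: ForEachAsync returns a plain Task, so results are
        // gathered through a side effect on a thread-safe collection.
        ConcurrentQueue<int> results = new();

        object gate = new();
        int running = 0;
        int peak = 0;

        // Point 3: cap the number of concurrent delegate invocations.
        ParallelOptions options = new()
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        await Parallel.ForEachAsync(items, options, async (item, ct) =>
        {
            lock (gate)
            {
                running++;
                peak = Math.Max(peak, running);
            }

            await Task.Delay(20, ct); // stand-in for the real asynchronous work
            results.Enqueue(item * item);

            lock (gate)
            {
                running--;
            }
        });

        // The order of the results reflects completion order, not input order.
        return (results.ToArray(), peak);
    }
}
```

With `maxDegreeOfParallelism` set to 3, the observed peak should never exceed 3, whereas starting all the tasks up front and passing them to `Task.WhenAll` would impose no such limit.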

Theodor Zoulias
  • I have posted [here](https://stackoverflow.com/questions/30907650/foreachasync-with-result/71129678#71129678 "ForEachAsync with Result") a `ForEachAsync` variant that returns results. – Theodor Zoulias Jun 27 '22 at 01:38
  • 1
    Hi, nice answer, just wondering do you know the performance difference between the two? I am more interested in speed. – Andy Song Jun 27 '22 at 21:33
  • The reason I ask is this question: https://stackoverflow.com/questions/72778189/how-to-improve-speed-for-multiple-npgsql-connections. I'm not sure which way to go. Do you have any ideas? Thanks. – Andy Song Jun 27 '22 at 21:39
  • 1
    @AndySong probably the `Parallel.ForEachAsync` as a mechanism has more overhead than the `Task.WhenAll`, but the difference should be in the scale of nanoseconds. The overhead of either mechanism is unlikely to have any noticeable effect in the performance of your app though. On the other hand the `Parallel.ForEachAsync` by being able to control the degree of parallelism might result in a smoother communication pattern with your database, resulting in big time performance improvements. If your database is happy, your app will be happy too! – Theodor Zoulias Jun 27 '22 at 21:53
  • 1
    "This raises concerns about using the `Parallel.ForEachAsync` in ASP.NET applications, where offloading work on the `ThreadPool` might have scalability implications." To confirm, you're talking specifically about scenarios in which your delegate contains CPU-intensive code prior to `await`, right? Meaning using `ForEachAsync` would clog up multiple CPU cores while kicking off the collection of tasks, whereas you'd normally kick the tasks off one at a time and avoid that issue. – MarredCheese Oct 18 '22 at 15:47
  • @MarredCheese I am not talking about CPU-bound code. This is obviously bad in ASP.NET. Stephen Cleary has written on multiple occasions that `Task.Run` in ASP.NET is bad, even with async delegates, because it throws off the thread pool heuristics. If `Task.Run` is bad, `Parallel.ForEachAsync` should be even worse. I am skeptical of this claim, but Stephen Cleary is an expert in this field. Links: [1](https://stackoverflow.com/a/70966172/11178549), [2](https://stackoverflow.com/a/40065172/11178549), [3](https://stackoverflow.com/questions/55202809/). – Theodor Zoulias Oct 19 '22 at 03:45