Factors for determining the degree of parallelism for the ForEachAsync

Question

Below is an implementation of ForEachAsync written by Stephen Toub.

public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop,
    Func<T, Task> body) 
{ 
    return Task.WhenAll( 
        from partition in Partitioner.Create(source).GetPartitions(dop) 
        select Task.Run(async delegate { 
            using (partition) 
                while (partition.MoveNext()) 
                    await body(partition.Current); 
        })); 
}

What factors should be considered when specifying a partitionCount (dop in this case)?

Does the hardware make a difference (# of cores, available RAM, etc)?

Does the type of data/operation influence the count?

My first guess would be to set dop equal to Environment.ProcessorCount for general cases, but my gut tells me that's probably unrelated.

score 4 · Accepted Answer · answered Dec 18 '15 at 19:58

Both hardware as well as operations executed matter a lot.

If you want to run CPU-bound work that is not constrained in any other way, you don't need this method at all. You're better off using Parallel or PLINQ, which are made for that (and suck terribly at IO).
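As a rough illustration of the CPU-bound case, here is a minimal sketch using Parallel.ForEach and PLINQ. The class name CpuBoundDemo and the SumOfSquares workload are made up for this example; they only stand in for "some CPU-bound operation".

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CpuBoundDemo
{
    // Hypothetical CPU-bound operation, used for illustration only.
    public static long SumOfSquares(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += (long)i * i;
        return sum;
    }

    // Parallel.ForEach: here the core count is a sensible upper bound,
    // because the work is limited by the CPU and nothing else.
    public static long WithParallelForEach(int[] inputs)
    {
        long total = 0;
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        Parallel.ForEach(inputs, options,
            n => Interlocked.Add(ref total, SumOfSquares(n)));
        return total;
    }

    // PLINQ: the same computation expressed as a parallel query.
    public static long WithPlinq(int[] inputs) =>
        inputs.AsParallel().Select(n => SumOfSquares(n)).Sum();
}
```

Neither version involves async/await: the threads are busy computing the whole time, so there is nothing to await.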

For IO there is no easy way to predict the best DOP. For example, magnetic disks like DOP 1. SSDs like 4-16(?). Web services could like pretty much any value. I could continue this list for dozens more factors including databases, lock contention etc.

You need to test different values in a test environment. Then use the best-performing value.
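One way to run such a test is sketched below. It reuses the ForEachAsync extension from the question and times a full pass for each candidate DOP. The names DopBenchmark and MeasureAsync are invented for this example, and a single timing per value is only a starting point; real measurements should be repeated against the actual IO target.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DopBenchmark
{
    // The ForEachAsync extension quoted in the question.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop,
        Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }

    // Hypothetical helper: runs the whole workload once per candidate dop
    // and records how long each pass took.
    public static async Task<Dictionary<int, TimeSpan>> MeasureAsync<T>(
        IReadOnlyList<T> items, IEnumerable<int> candidates, Func<T, Task> body)
    {
        var results = new Dictionary<int, TimeSpan>();
        foreach (int dop in candidates)
        {
            var stopwatch = Stopwatch.StartNew();
            await items.ForEachAsync(dop, body);
            stopwatch.Stop();
            results[dop] = stopwatch.Elapsed;
        }
        return results;
    }
}
```

A call might look like `await DopBenchmark.MeasureAsync(urls, new[] { 1, 2, 4, 8, 16 }, DownloadAsync)`, where urls and DownloadAsync are placeholders for your own data and IO operation.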

Using Environment.ProcessorCount makes no sense with IO. When you add CPUs, IO does not get faster.

usr

Your argument against using it for IO makes sense. But why (specifically) is `Parallel` or `PLINQ` better off than this extension using `async-await`? – Jim Buck Dec 18 '15 at 21:01
I think they have a different approach to spawning threads. This code would always spawn N while Parallel has provisions to spawn fewer if the thread pool is saturated. Not sure it matters much, though. In any case async is pointless with CPU bound work. – usr Dec 18 '15 at 21:03
As I read your responses it seems like common sense. Thank you for explaining the use cases for each type of problem: CPU vs. IO vs. External(think HTTP requests). – Jim Buck Dec 18 '15 at 21:10

Theodor Zoulias · Answer 2 · 2023-02-13T07:05:29.890

Starting from .NET 6, the Parallel.ForEachAsync method is part of the standard libraries. The default MaxDegreeOfParallelism for this method is Environment.ProcessorCount. This doesn't mean that the number of CPU cores is the optimal concurrency limit for the majority of asynchronous scenarios. It only means that any positive number is preferable to -1 (unlimited), which is the default MaxDegreeOfParallelism for the non-asynchronous Parallel APIs. An unlimited default parallelism would cause lots of unintentional DoS attacks by developers who are just experimenting with the API. Selecting a constant value, like 10, was probably deemed too arbitrary, so they opted for Environment.ProcessorCount. It makes sense. It is logical to assume that there is some correlation between the power of the machine and the power of the network that it is connected to.

My suggestion for configuring the Parallel.ForEachAsync/MaxDegreeOfParallelism is to not rely on the default, experiment with various values, start with a small value like 2, be conservative, and consider making it configurable manually through the App.config. The optimal value might change during the lifetime of the application.
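A minimal sketch of that advice, assuming .NET 6 or later. To keep the example self-contained, the limit is parsed from a plain string (for instance the value of an environment variable or a config entry) rather than read from App.config; the names ConfigurableDop, ReadMaxDop and ProcessAsync are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ConfigurableDop
{
    // Hypothetical helper: parses a configured limit and falls back to a
    // conservative default when the setting is missing or invalid.
    public static int ReadMaxDop(string raw, int fallback = 2) =>
        int.TryParse(raw, out int value) && value > 0 ? value : fallback;

    // Passes an explicit MaxDegreeOfParallelism to Parallel.ForEachAsync
    // instead of relying on the Environment.ProcessorCount default.
    public static Task ProcessAsync<T>(IEnumerable<T> source, int maxDop,
        Func<T, CancellationToken, ValueTask> body)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDop };
        return Parallel.ForEachAsync(source, options, body);
    }
}
```

For example, `ConfigurableDop.ReadMaxDop(Environment.GetEnvironmentVariable("MAX_DOP"))` would read the limit from an environment variable; MAX_DOP is a made-up name, and any configuration source would do.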

It should be noted that the .NET 6 Parallel.ForEachAsync method does not behave identically to the one-liner ForEachAsync shown in the question. The most important difference is that in case of an exception, Parallel.ForEachAsync will stop invoking the body and will complete ASAP as faulted. In contrast, the one-liner will continue invoking the body as long as there is a worker task still alive. Each error will kill one of the dop workers. If you are unlucky enough to have exactly dop - 1 early exceptions, the last standing worker will slowly process the remaining elements alone, until the exceptions are finally surfaced. For implementations with better behavior on .NET platforms older than .NET 6, look at this question, or this.

score -1 · Answer 3 · answered Aug 26 '18 at 06:11

The DOP value affects the load placed on the hardware. You can give DOP any feasible value.

The value you give determines how many items are processed in parallel at a time.

For example, suppose you have a DataSet with 1000 rows and you need to perform a certain operation on each row. If your DOP value is 50, then 50 rows will be processed in parallel at a time.
