By default a `Parallel.ForEach` loop uses threads from the `ThreadPool`, which is a static class; there is only one per process. It is possible to modify this behavior by configuring the `TaskScheduler` property of the `ParallelOptions`. Creating a custom `TaskScheduler` that functions as an alternative `ThreadPool` is not exactly trivial, but not rocket science either. An implementation can be found here. If you want to learn more about custom task schedulers, you can read this article by Stephen Toub (code).
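As a lighter alternative to writing a whole `TaskScheduler` from scratch, here is a minimal sketch that plugs the `ConcurrentScheduler` of the built-in `ConcurrentExclusiveSchedulerPair` into the `ParallelOptions`. Note that this scheduler still runs its work on `ThreadPool` threads; it only caps how many of them are used concurrently:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class SchedulerDemo
{
    static void Main()
    {
        // A ConcurrentExclusiveSchedulerPair wraps an underlying scheduler
        // (here the default ThreadPool scheduler) and limits how many tasks
        // it runs concurrently.
        var pair = new ConcurrentExclusiveSchedulerPair(
            TaskScheduler.Default, maxConcurrencyLevel: 2);

        var options = new ParallelOptions()
        {
            TaskScheduler = pair.ConcurrentScheduler,
            MaxDegreeOfParallelism = 2,
        };

        Parallel.ForEach(Enumerable.Range(1, 6), options, item =>
        {
            Console.WriteLine(
                $"Processing #{item} on thread {Environment.CurrentManagedThreadId}");
        });
    }
}
```

The `MaximumConcurrencyLevel` of the `ConcurrentScheduler` reflects the cap that was passed to the pair's constructor.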
Now what happens when two parallel loops are running concurrently is that both schedule work on `ThreadPool` threads. If both are configured with a specific `MaxDegreeOfParallelism`, and the sum of the two does not exceed the minimum number of threads that the `ThreadPool` creates on demand¹, then the two loops will not interfere with each other regarding their scheduling. Of course they can still compete with each other for CPU resources, in case these are scarce. In that case the operating system will be the arbiter.
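As a sketch of this non-interfering scenario (the numbers below are arbitrary, chosen just for illustration), two loops with `MaxDegreeOfParallelism = 3` each fit comfortably in a pool whose minimum thread count is raised to 8:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class NonInterferingLoops
{
    static int _countA, _countB;

    static void Main()
    {
        // Make sure the ThreadPool creates at least 8 threads on demand,
        // enough for both loops (3 + 3 = 6 <= 8).
        ThreadPool.SetMinThreads(8, 8);

        var options = new ParallelOptions() { MaxDegreeOfParallelism = 3 };

        Task loopA = Task.Run(() => Parallel.ForEach(Enumerable.Range(1, 9),
            options, _ => { Thread.Sleep(100); Interlocked.Increment(ref _countA); }));
        Task loopB = Task.Run(() => Parallel.ForEach(Enumerable.Range(1, 9),
            options, _ => { Thread.Sleep(100); Interlocked.Increment(ref _countB); }));

        Task.WaitAll(loopA, loopB);
        Console.WriteLine($"A processed {_countA} items, B processed {_countB} items.");
    }
}
```

Neither loop has to wait for the other to release threads, so both make steady progress.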
In case at least one of the parallel loops is not configured with a specific `MaxDegreeOfParallelism`, the effective default of this option is `-1`, which means unbounded parallelism. This will cause the `ThreadPool` to become immediately saturated, and to remain saturated until the source enumerable of the unconfigured parallel loop completes. During this period the two parallel loops will interfere heavily with each other, and which loop gets the extra thread that the saturated `ThreadPool` injects every ~1,000 msec is a matter of who asked for it first. On top of that, a saturated `ThreadPool` negatively affects any other independent callbacks, timer events, async continuations etc. that may also be active during this period.
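You can verify this default easily:

```csharp
using System;
using System.Threading.Tasks;

class DefaultDop
{
    static void Main()
    {
        // The MaxDegreeOfParallelism option defaults to -1 (unbounded).
        var options = new ParallelOptions();
        Console.WriteLine(options.MaxDegreeOfParallelism); // -1
    }
}
```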
In case both parallel loops are configured, but the sum of their `MaxDegreeOfParallelism` values exceeds the number of available threads, the situation is similar to the previous one. The only difference is that the number of threads in the `ThreadPool` will gradually increase, so the saturation may end before the execution of the parallel loops does.
Below is an example that demonstrates this behavior:
```csharp
ThreadPool.SetMinThreads(4, 4);

Task[] tasks = new[] { 'A', 'B' }.Select(name => Task.Run(() =>
{
    Thread.Sleep(100);
    if (name == 'B') Thread.Sleep(500);
    Print($"{name}-Starting");
    var options = new ParallelOptions() { MaxDegreeOfParallelism = 10 };
    Parallel.ForEach(Enumerable.Range(1, 10), options, item =>
    {
        Print($"{name}-Processing #{item}");
        Thread.Sleep(1000);
    });
    Print($"{name}-Finished");
})).ToArray();

Task.WaitAll(tasks);

static void Print(string line)
{
    Console.WriteLine($@"{DateTime.Now:HH:mm:ss.fff} [{Thread.CurrentThread
        .ManagedThreadId}] > {line}");
}
```
Output:
```
15:34:20.054 [4] > A-Starting
15:34:20.133 [6] > A-Processing #2
15:34:20.133 [7] > A-Processing #3
15:34:20.133 [4] > A-Processing #1
15:34:20.552 [5] > B-Starting
15:34:20.553 [5] > B-Processing #1
15:34:20.956 [8] > A-Processing #4
15:34:21.133 [4] > A-Processing #5
15:34:21.133 [7] > A-Processing #6
15:34:21.133 [6] > A-Processing #7
15:34:21.553 [5] > B-Processing #2
15:34:21.957 [8] > A-Processing #8
15:34:21.957 [9] > A-Processing #9
15:34:22.133 [4] > A-Processing #10
15:34:22.134 [7] > B-Processing #3
15:34:22.134 [6] > B-Processing #4
15:34:22.553 [5] > B-Processing #5
15:34:22.957 [8] > B-Processing #6
15:34:22.958 [9] > B-Processing #7
15:34:23.134 [4] > A-Finished
15:34:23.134 [4] > B-Processing #8
15:34:23.135 [7] > B-Processing #9
15:34:23.135 [6] > B-Processing #10
15:34:24.135 [5] > B-Finished
```
(Try it on Fiddle)
You can see that the parallel loop A initially utilizes 3 threads (threads 4, 6 and 7), while the parallel loop B utilizes only thread 5. At that point the `ThreadPool` is saturated. Around 500 msec later the new thread 8 is injected, and it is taken by the A loop. The B loop still has only one thread. Another second later one more thread, thread 9, is injected. This too goes to loop A, setting the score at 5-1 in favor of loop A. There is no politeness or courtesy in this battle. It's a wild competition for limited resources. If you expect to have more than one parallel loop running at the same time, make sure that all of them have their `MaxDegreeOfParallelism` option configured, and that the `ThreadPool` can create enough threads on demand to accommodate all of them.
¹ This number is configured with the method `ThreadPool.SetMinThreads`, and AFAIK by default it is equal to `Environment.ProcessorCount`.
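As a sketch of the advice above (the numbers are hypothetical), the minimum worker thread count can be raised to cover the total degree of parallelism of all the loops you plan to run:

```csharp
using System;
using System.Threading;

class ConfigurePool
{
    static void Main()
    {
        // Suppose we plan to run three concurrent loops,
        // each with MaxDegreeOfParallelism = 4.
        int totalDegreeOfParallelism = 3 * 4;

        // Raise the minimum only if it is currently lower, keeping the
        // configured minimum for I/O completion threads unchanged.
        ThreadPool.GetMinThreads(out int workers, out int completionPorts);
        if (workers < totalDegreeOfParallelism)
            ThreadPool.SetMinThreads(totalDegreeOfParallelism, completionPorts);

        ThreadPool.GetMinThreads(out workers, out _);
        Console.WriteLine($"Minimum worker threads: {workers}");
    }
}
```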
Note: The above text describes the existing behavior of the static `Parallel` class (.NET 5). Parallelism achieved through PLINQ (the `AsParallel` LINQ operator) does not have the same behavior in all aspects. Also, in the future the `Parallel` class may get new methods with different defaults.
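For instance, PLINQ caps its degree of parallelism by default (at `Environment.ProcessorCount`, if I remember correctly) instead of being unbounded, and it is configured with a dedicated operator rather than an options object:

```csharp
using System;
using System.Linq;

class PlinqDemo
{
    static void Main()
    {
        // WithDegreeOfParallelism is PLINQ's counterpart of the
        // MaxDegreeOfParallelism option. AsOrdered preserves the source order
        // of the results despite the parallel execution.
        int[] squares = Enumerable.Range(1, 5)
            .AsParallel()
            .AsOrdered()
            .WithDegreeOfParallelism(2)
            .Select(x => x * x)
            .ToArray();

        Console.WriteLine(string.Join(", ", squares)); // 1, 4, 9, 16, 25
    }
}
```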
.NET 6 update: The above example now produces a different output. The score ends up being only 3-2 in favor of loop A:
```
04:34:47.894 [4] > A-Starting
04:34:47.926 [8] > A-Processing #1
04:34:47.926 [7] > A-Processing #2
04:34:47.926 [4] > A-Processing #3
04:34:48.392 [6] > B-Starting
04:34:48.393 [6] > B-Processing #1
04:34:48.792 [9] > B-Processing #2
04:34:48.927 [4] > A-Processing #4
04:34:48.927 [8] > A-Processing #5
04:34:48.927 [7] > A-Processing #6
04:34:49.393 [6] > B-Processing #3
04:34:49.792 [9] > B-Processing #4
04:34:49.927 [4] > A-Processing #7
04:34:49.927 [8] > A-Processing #8
04:34:49.928 [7] > A-Processing #9
04:34:50.393 [6] > B-Processing #5
04:34:50.792 [9] > B-Processing #6
04:34:50.927 [4] > A-Processing #10
04:34:50.928 [8] > B-Processing #8
04:34:50.928 [7] > B-Processing #7
04:34:51.393 [6] > B-Processing #9
04:34:51.928 [4] > A-Finished
04:34:52.393 [6] > B-Processing #10
04:34:53.394 [6] > B-Finished
```
The injected thread 9 is taken by loop B instead of loop A. It seems that the behavior of the `Parallel` class, or the `ThreadPool`, or both, has changed slightly in .NET 6. But I am not sure what exactly the changes are.