I recently noticed that PLINQ's AsParallel
only utilizes half the cores on one of our soon-to-be production systems, so I wrote a tiny benchmark in LINQPad:
// should terminate with an AggregateException containing t OverflowExceptions,
// where t is the number of threads used
Enumerable.Range(0, Int32.MaxValue).AsParallel().Select(i =>
    Enumerable.Range(0, i).Select(j => (long)j).Sum()
    + Enumerable.Range(0, Int32.MaxValue - i).Select(k => (long)k).Sum()
).Sum()
Running this piece of code on a two-socket (16C/32T) system shows only 16 busy cores (and finishes with 16 overflows, for that matter), which leads me to the initial question: Is PLINQ limited to one socket/NUMA node? Or what is the limiting factor here?
Update
Environment.ProcessorCount.Dump("detected processor count");
reports 16 instead of 32, which may be why PLINQ uses only 16 worker threads in the first place. But then again, even when explicitly using .WithDegreeOfParallelism(32),
only 16 cores are used.
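To separate "PLINQ starts too few workers" from "the workers are confined to too few cores", one could count the distinct managed threads a query actually runs on. The following is only a minimal sketch: the class name, the workload size, and the spin count are made up for illustration, and the thread count says nothing about which cores those threads are scheduled on.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

static class PlinqThreadCount
{
    static void Main()
    {
        // Collects the managed thread IDs that execute the Select delegate.
        var threadIds = new ConcurrentDictionary<int, byte>();

        long total = Enumerable.Range(0, 1000)
            .AsParallel()
            .WithDegreeOfParallelism(32)   // upper bound requested from PLINQ
            .Select(i =>
            {
                threadIds.TryAdd(Environment.CurrentManagedThreadId, 0);
                Thread.SpinWait(1000000);  // arbitrary busy work to keep workers occupied
                return (long)i;
            })
            .Sum();

        Console.WriteLine($"Environment.ProcessorCount: {Environment.ProcessorCount}");
        Console.WriteLine($"distinct PLINQ worker threads: {threadIds.Count}");
        Console.WriteLine($"sum (only to force evaluation): {total}");
    }
}
```

If this reports noticeably more than 16 threads while Task Manager still shows only 16 busy cores, the restriction is more likely imposed by the OS or firmware configuration than by PLINQ itself.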
It looks like we're also affected by the HP ProLiant Gen9 / Xeon E5 2600v3 Kgroup Clustered vs. Flat configuration issue. If you have 64 or fewer logical processors, use Flat
and you can utilize all cores again. However, if you have more, there seems to be no solution at the moment.
The question of whether PLINQ is limited to a NUMA node still remains. My guess is that it doesn't care about NUMA nodes and only uses the first kernel group.
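One way to check the kernel-group theory is to ask Windows how the logical processors are split into groups and compare that with what .NET reports. This is a rough, Windows-only sketch using the kernel32 functions GetActiveProcessorGroupCount and GetActiveProcessorCount via P/Invoke; the class name is made up, and it only reports the layout, it does not change any affinity.

```csharp
using System;
using System.Runtime.InteropServices;

static class ProcessorGroupInfo
{
    // Win32 constant: query the processor count across all groups.
    const ushort ALL_PROCESSOR_GROUPS = 0xFFFF;

    [DllImport("kernel32.dll")]
    static extern ushort GetActiveProcessorGroupCount();

    [DllImport("kernel32.dll")]
    static extern uint GetActiveProcessorCount(ushort groupNumber);

    static void Main()
    {
        ushort groups = GetActiveProcessorGroupCount();
        Console.WriteLine($"processor groups: {groups}");

        // Logical processors per group, as seen by the OS.
        for (ushort g = 0; g < groups; g++)
            Console.WriteLine($"  group {g}: {GetActiveProcessorCount(g)} logical processors");

        Console.WriteLine($"all groups: {GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)} logical processors");
        Console.WriteLine($"Environment.ProcessorCount: {Environment.ProcessorCount}");
    }
}
```

On a machine like the one described, two groups of 16 alongside an Environment.ProcessorCount of 16 would be consistent with the process seeing only one group, though that alone would not prove PLINQ is the limiting factor.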