I'm using Parallel.For to improve the performance of some long-running code, which I've simplified here.

Increasing MaxDegreeOfParallelism from 1 to 2 roughly halves the completion time, but increasing it from 2 to 3 cuts the time by only a quarter. Each subsequent increase improves performance even less. However, the CPU % shown in Task Manager keeps increasing linearly, which means the CPU is working harder and harder for smaller and smaller gains as MaxDegreeOfParallelism grows.

How is that possible? What is the CPU spending the cycles on? Can anything be done with the code to improve performance at higher degrees of parallelism?

using System.Diagnostics;
float sum = 0f;
Stopwatch sw = Stopwatch.StartNew();

Parallel.For(0, 1000, new ParallelOptions { MaxDegreeOfParallelism = -1 }, (iter) =>
{
    float answer = 0f;
    for (int i = 0; i < 100000; i++)
    {
        for (int j = 0; j < 100; j++)
            answer += MathF.Sqrt(i * j);
    }
    sum += answer;
});

Console.WriteLine($"{sw.ElapsedMilliseconds} {sum}");

The below tests were done in Release mode, on a quad core CPU with 8 logical cores.

[Image: Parallel performance]

Theodor Zoulias
jrw
  • Have a look at https://benchmarkdotnet.org/ for benchmarks you can count on. After that and presenting the results it is worth to think about – Sir Rufo Feb 25 '23 at 18:29
  • As discussed multiple times here: Using parallel computing takes overhead. Sqrt is not expensive enough to make up for this. – TaW Feb 25 '23 at 19:43
  • A PC runs a multitasking operating system. Tasks are not run to completion but get swapped with other operating system tasks. The swapping takes overhead, as TaW stated. How long the swapping takes depends on the amount of memory on the machine, the number of cores in the processor, and other factors. If you are getting linear gains, it probably means each parallel task is being placed on a different core, so they execute at the same time. Once your degree of parallelism exceeds the number of cores, you probably get no additional gain. – jdweng Feb 25 '23 at 21:57
  • @jdweng Is it possible to prevent this swapping action from happening? For example, by telling Windows I want to reserve 4 logical cores fully dedicated to my application, and let Windows keep the other 4 for OS tasks? – jrw Feb 26 '23 at 06:17
  • @jdweng *"I/O bound tasks are often the best candidates for parallelism."* -- I assume that you are using the term "parallelism" loosely, instead of the correct term: concurrency. Parallelism is when tasks literally run at the same time, on a multicore processor, not just when they start, run, and complete in overlapping time periods. See this question: [What is the difference between concurrency and parallelism?](https://stackoverflow.com/questions/1050222/what-is-the-difference-between-concurrency-and-parallelism) – Theodor Zoulias Feb 26 '23 at 08:14
  • It might not be relevant to the core issue, but the unsynchronized `sum += answer;` line could result in an incorrect final `sum`. [`Interlocked.Add`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.interlocked.add) doesn't support the `float` type, so using a `lock` is probably the simplest way to ensure the correctness of the calculation. – Theodor Zoulias Feb 26 '23 at 08:49
  • Only by setting the priority of the tasks, which will not dedicate a core to a task. – jdweng Feb 26 '23 at 08:56
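
For reference, the `lock` suggestion from the comments could look roughly like the sketch below. It is one possible arrangement, not code from the question: it uses the `Parallel.For` overload with a thread-local state so that each worker task keeps a private running total and takes the lock only once, when it finishes. The helper name `SumOfRoots`, its parameters, and the `sumLock` object are hypothetical names introduced for illustration. This only addresses the correctness of `sum`; it is not claimed to explain or fix the scaling behaviour described in the question.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Thread-safe variant of the loop from the question.
// SumOfRoots and its parameters are hypothetical names, not part of the original code.
static float SumOfRoots(int iterations, int outer, int inner, int maxDegree)
{
    float sum = 0f;
    object sumLock = new object();

    Parallel.For(0, iterations,
        new ParallelOptions { MaxDegreeOfParallelism = maxDegree },
        () => 0f,                               // initial private total for each worker task
        (iter, state, localSum) =>
        {
            float answer = 0f;
            for (int i = 0; i < outer; i++)
                for (int j = 0; j < inner; j++)
                    answer += MathF.Sqrt(i * j);
            return localSum + answer;           // no shared state is touched here
        },
        localSum =>
        {
            // Runs once per worker task, not once per iteration.
            lock (sumLock) { sum += localSum; }
        });

    return sum;
}

// Same loop sizes as in the question.
Stopwatch sw = Stopwatch.StartNew();
float total = SumOfRoots(1000, 100000, 100, Environment.ProcessorCount);
Console.WriteLine($"{sw.ElapsedMilliseconds} {total}");
```

Because `float` addition is not associative, the printed total can differ slightly from run to run depending on the order in which the partial sums are combined.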

0 Answers