There is a C# function A(arg1, arg2) which needs to be called many times. To do this as fast as possible, I am using parallel programming.

Take the example of the following code:

long totalCalls = 2000000;
int threads = Environment.ProcessorCount;

ParallelOptions options = new ParallelOptions(); 
options.MaxDegreeOfParallelism = threads;

Parallel.ForEach(Enumerable.Range(1, threads), options, range =>
{
    for (int i = 0; i < totalCalls / threads; i++)
    {
        // init arg1 and arg2
        var value = A(arg1, arg2);
        // do something with value
    }
});

Now the issue is that this is not scaling up with an increase in number of cores; e.g. on 8 cores it is using 80% of CPU and on 16 cores it is using 40-50% of CPU. I want to use the CPU to maximum extent.

You may assume A(arg1, arg2) internally contains a complex calculation, but it doesn't have any IO or network-bound operations, and there is no thread locking. How else can I find out which part of the code is preventing it from running fully in parallel?

I also tried increasing the degree of parallelism, e.g.

int threads = Environment.ProcessorCount * 2;
// AND
int threads = Environment.ProcessorCount * 4;
// etc.

But it was of no help.

Update 1 - if I run the same code but replace A() with a simple function that calculates prime numbers, it utilizes 100% of the CPU and scales well. So this shows that the surrounding code is correct, and the issue could be within the original function A(). I need a way to detect whatever is causing the work to be serialized.

Ramesh Soni
  • wouldn't you be better off using `Task`s not `Parallel.ForEach`? you can then control the tasks and their number much better. – Liam Jul 07 '16 at 10:48
  • which os? are you running a release build without vshost.exe? is the process the only one running when you measure CPU usage? process priority? – Cee McSharpface Jul 07 '16 at 10:49
  • @dlatikay - Windows Server 2012. Yes, I am running release build and apart from default OS features this is the only program which is running. I haven't set the priority. Let me try that too. – Ramesh Soni Jul 07 '16 at 10:53
  • @Liam The Tasks API (Task or Parallel.For(Each)) implements thread pooling. As such shouldn't it be the case that using `MaxDegreeOfParallelism` should be just as effective as using multiple tasks? – Alexei Barnes Jul 07 '16 at 10:53
  • @Liam - will try this out and update you with result. – Ramesh Soni Jul 07 '16 at 10:54
  • Please check update 1. – Ramesh Soni Jul 07 '16 at 11:03
  • @AlexeiBarnes Does it? I don't know for certain. If it was Task(s) though I would know for certain as I would have control of all of the processing. If a helper method (which this is) doesn't seem to be doing what you'd expect then go back to basics and roll your own. – Liam Jul 07 '16 at 12:13
  • @Liam That would be `Thread` not `Task`. All of these are just helpers for base windows threads, but `Thread` is the most base, `Task` does not create a thread necessarily. http://stackoverflow.com/questions/13429129/task-vs-thread-differences – Alexei Barnes Jul 07 '16 at 12:15
  • Your update seems to imply that the issue lies within A(). Without seeing A() it's hard to help. My guess is that A() utilises some kind of shared resource that is hoisted when put into the lambda directly – Liam Jul 07 '16 at 12:15
  • It finally worked, and the problem was not within `A()`; rather, it had something to do with GC. Setting up `` worked. Very good finding for anyone stuck in a similar situation. Thanks to @usr – Ramesh Soni Jul 08 '16 at 04:29

1 Answer

You have determined that the code in A is the problem.

There is one very common problem: garbage collection. Configure your application in app.config to use the concurrent server GC. The workstation GC tends to serialize execution, and the effect is severe.
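To check which GC mode a process actually ends up with, something like the following sketch may help. The class name `GcDiagnostics` and its two helpers are hypothetical names introduced here for illustration; `GCSettings.IsServerGC`, `GCSettings.LatencyMode` and `GC.CollectionCount` are standard .NET APIs. The comment shows the app.config elements that usually enable server GC on .NET Framework; treat it as an assumption about your setup rather than the exact setting used in the question.

```csharp
using System;
using System.Runtime;

public static class GcDiagnostics
{
    // app.config fragment that typically enables server GC (.NET Framework):
    //
    //   <configuration>
    //     <runtime>
    //       <gcServer enabled="true"/>
    //       <gcConcurrent enabled="true"/>
    //     </runtime>
    //   </configuration>

    // Describes the GC mode the current process is running with.
    public static string DescribeGcMode()
    {
        return string.Format("ServerGC={0}, LatencyMode={1}",
            GCSettings.IsServerGC, GCSettings.LatencyMode);
    }

    // Runs the given work and returns how many gen-0 collections
    // happened in the process while it was running.
    public static int CountGen0Collections(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }
}
```

Printing `GcDiagnostics.DescribeGcMode()` at startup confirms whether the configuration took effect. Wrapping a batch of calls to `A` in `CountGen0Collections` gives a rough idea of how much allocation pressure the function creates; a high count hints that GC may be what limits scaling.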

If this is not the problem, pause the debugger a few times and look at the Debug -> Parallel Stacks window. There, you can see what your threads are doing. Look for common resources and contention. For example, if you find many threads waiting for a lock, that's your problem.

Another nice debugging technique is commenting out code. Once the scalability limit disappears you know what code caused it.
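One way to make the commenting-out approach measurable is to time the same workload at several degrees of parallelism and watch where the speedup stops. This is only a minimal sketch: `ScalingProbe`, `Measure` and `Report` are hypothetical names, and `body` stands in for a call to `A` (or a variant of it with parts commented out).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ScalingProbe
{
    // Runs `body` roughly totalCalls times, split evenly across `degree`
    // workers, and returns the elapsed wall-clock time.
    public static TimeSpan Measure(int degree, long totalCalls, Action body)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = degree };
        long perWorker = totalCalls / degree;
        var sw = Stopwatch.StartNew();
        Parallel.For(0, degree, options, worker =>
        {
            for (long i = 0; i < perWorker; i++)
            {
                body();
            }
        });
        sw.Stop();
        return sw.Elapsed;
    }

    // Prints the elapsed time for 1, 2, 4, ... workers up to the processor count.
    public static void Report(long totalCalls, Action body)
    {
        for (int degree = 1; degree <= Environment.ProcessorCount; degree *= 2)
        {
            TimeSpan elapsed = Measure(degree, totalCalls, body);
            Console.WriteLine("{0} worker(s): {1:F0} ms", degree, elapsed.TotalMilliseconds);
        }
    }
}
```

If the time roughly halves each time the worker count doubles, the variant under test scales. If it flattens out, the part of `A` that is still enabled contains the bottleneck.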

usr