
I'm running heavy mathematical computations with Math.NET Numerics in parallel inside a Parallel.For block.

When I run the code on my local machine with 4 cores (2*2), it uses all 4 cores.

But when I run the same code on our dev server with 8 cores (4*2), it uses only 4 cores.

I've tried setting MaxDegreeOfParallelism, but it didn't help.

Any idea why not all cores are being utilised?

Below is sample code.

    Parallel.For(0, 10000, i =>
    {
        // heavy math computations using matrices
    });
malkam
  • Does your server have 8 cores or 4 cores that support hyper threading? – vcsjones Aug 24 '15 at 17:36
  • @vcsjones: 4 CPUs and each has 2 cores so total 4*2=8 cores – malkam Aug 24 '15 at 17:38
  • Are you using a native provider? Note that Linear Algebra in Math.NET Numerics is itself parallelized (at least in parts). If you prefer to do your own parallelization on top, consider disabling Math.NET's parallelization by calling `Control.UseSingleThread();` – Christoph Rüegg Aug 24 '15 at 17:59
  • @ChristophRüegg I'm using the Intel MKL provider. I'm performing a large number of matrix operations in each iteration. – malkam Aug 24 '15 at 18:08
  • Show the actual code being run in that for. – i3arnon Aug 24 '15 at 18:14
  • @malkam assuming the matrices are dense, how large are they? How many cores get used if you run the same code sequentially (i.e. without using Parallel.For)? – Christoph Rüegg Aug 24 '15 at 18:14
  • @ChristophRüegg: Average matrix size might be 25*25. If I run sequentially, a single core is used. I'm performing around 200+ operations (add/subtract/divide/multiply/transpose). Not sure why it's not utilising all cores on the dev system when it works fine on my local system. – malkam Aug 24 '15 at 18:18
  • Please clarify again: what is the number of **physical** CPUs in your workstation and in your server? You said you have "2*2" on your dev machine. I find it unlikely that your dev machine has multiple processor sockets on its motherboard; that is rare outside of a server environment. Please clarify how many processors, cores, and threads your dev and production machines can handle by running `msinfo32` on each machine and copying the information in the `Processor` fields into your question. – Scott Chamberlain Aug 24 '15 at 18:28
  • If it really is multiple physical processors, this question may be helpful to you: [Unable to use more than one processor group for my threads in a C# app](http://stackoverflow.com/questions/28098082/unable-to-use-more-than-one-processor-group-for-my-threads-in-a-c-sharp-app) – Scott Chamberlain Aug 24 '15 at 18:30
  • @ScottChamberlain: On the local machine, 2 physical CPUs with 2 cores each, so 4 cores in total. On the dev machine, 4 physical CPUs with 2 cores each, so 8 cores in total. – malkam Aug 24 '15 at 18:33
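
The suggestion in the comments above (turn off Math.NET's own parallelization and keep only the outer loop) could look roughly like the sketch below. It assumes the MathNet.Numerics NuGet package is referenced. The class name, the method name, and the matrix arithmetic are hypothetical stand-ins for the real workload, and whether `Control.UseSingleThread()` also limits threads created inside a native provider such as MKL is something to verify on the target machine.

```csharp
using System;
using System.Threading.Tasks;
using MathNet.Numerics;
using MathNet.Numerics.LinearAlgebra;

public static class SingleThreadedProviderSketch
{
    // Hypothetical helper: runs 'iterations' small matrix jobs in parallel and
    // returns one number per job so the work cannot be optimized away.
    public static double[] Run(int iterations)
    {
        // Disable Math.NET's internal parallelization so that the outer
        // Parallel.For is the only layer that spreads work across cores.
        Control.UseSingleThread();

        var results = new double[iterations];
        var options = new ParallelOptions
        {
            // Upper bound only; the scheduler may still use fewer threads.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.For(0, iterations, options, i =>
        {
            // Placeholder workload: 25x25 dense matrices, as in the question.
            Matrix<double> a = Matrix<double>.Build.Dense(25, 25, (r, c) => r + c + i);
            Matrix<double> b = Matrix<double>.Build.Dense(25, 25, (r, c) => r - c + 1.0);
            Matrix<double> product = (a * b + a).Transpose();

            // Each iteration writes to its own slot, so no locking is needed.
            results[i] = product.FrobeniusNorm();
        });

        return results;
    }
}
```

Calling `SingleThreadedProviderSketch.Run(10000)` while watching per-core CPU usage on both machines would show whether nested parallelism was part of the problem.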

2 Answers


From MSDN:

By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.

The way I read the documentation: if the underlying scheduler only offers a single thread, then setting MaxDegreeOfParallelism > 1 will still result in a single thread.

theB
  • Multiple threads/tasks have been created, but they run on only 4 cores when 8 cores are available. – malkam Aug 24 '15 at 18:00
  • The pool's scheduler and the runtime/TPL are actually deciding which threads run where and when. If the runtime only schedules onto 4 logical cores, then that's the number that will run concurrently, regardless of the number of threads that have been created. – theB Aug 24 '15 at 18:45
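
One way to check what this answer describes is to count how many loop bodies actually execute at the same moment. The sketch below is only a diagnostic under stated assumptions: `ParallelismProbe` and `MeasurePeakConcurrency` are hypothetical names, and `Thread.SpinWait` stands in for the real matrix work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelismProbe
{
    // Runs a CPU-bound Parallel.For and returns the highest number of loop
    // bodies that were observed executing at the same time.
    public static int MeasurePeakConcurrency(int iterations, int maxDegree)
    {
        int active = 0; // loop bodies currently running
        int peak = 0;   // highest value of 'active' seen so far

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.For(0, iterations, options, i =>
        {
            int now = Interlocked.Increment(ref active);

            // Lock-free update of peak = max(peak, now).
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                Interlocked.CompareExchange(ref peak, now, seen);
            }

            // Placeholder for the heavy computation: keep the CPU busy briefly.
            Thread.SpinWait(200000);

            Interlocked.Decrement(ref active);
        });

        return peak;
    }
}
```

For example, `ParallelismProbe.MeasurePeakConcurrency(10000, Environment.ProcessorCount)` can be run on both machines. The result never exceeds the requested degree, because MaxDegreeOfParallelism is an upper bound. A value well below it would suggest that the limit comes from the scheduler or the environment rather than from the loop options.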

Parallelization is decided at run time, based on current conditions and many other circumstances. You cannot force .NET to use all the cores (in managed code, at least).

From MSDN:

Conversely, by default, the Parallel.ForEach and Parallel.For methods can use a variable number of tasks. That's why, for example, the ParallelOptions class has a MaxDegreeOfParallelism property instead of a "MinDegreeOfParallelism" property. The idea is that the system can use fewer threads than requested to process a loop.

The .NET thread pool adapts dynamically to changing workloads by allowing the number of worker threads for parallel tasks to change over time. At run time, the system observes whether increasing the number of threads improves or degrades overall throughput and adjusts the number of worker threads accordingly.

Be careful if you use parallel loops with individual steps that take several seconds or more. This can occur with I/O-bound workloads as well as lengthy calculations. If the loops take a long time, you may experience an unbounded growth of worker threads due to a heuristic for preventing thread starvation that's used by the .NET ThreadPool class's thread injection logic.
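
Since the quoted passages describe the thread pool adjusting its worker count at run time, it can help to print what the runtime itself reports on each machine before blaming the loop. The following is a minimal sketch using only standard library calls. `RuntimeView` and `Describe` are hypothetical names, and whether `Environment.ProcessorCount` reflects restrictions such as process affinity depends on the runtime version, so treat the output as a hint rather than proof.

```csharp
using System;
using System.Threading;

public static class RuntimeView
{
    // Returns a short report of the processor count and thread pool limits
    // as seen by the current process.
    public static string Describe()
    {
        int minWorker, minIo, maxWorker, maxIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        return string.Format(
            "Logical processors seen by the runtime: {0}\n" +
            "Thread pool worker threads: min {1}, max {2}",
            Environment.ProcessorCount, minWorker, maxWorker);
    }
}
```

Printing `RuntimeView.Describe()` on the local machine and on the dev server makes the two environments easy to compare. If the server reports fewer logical processors than expected, the cause probably lies outside the Parallel.For call.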

Theodor Zoulias
Dexion