3

My question concerns multithreading performance in a Windows environment. When I tested my code, increasing the number of threads did not improve the performance of my parallel calculations, and beyond a certain thread count performance actually got worse. What is going on? Is there a formula for the optimal number of threads: F(processors, memory, ...) = ?

Marcos Gonzalez
garik

6 Answers

6

To begin with, your CPU has a hardware limit on how many threads it can execute concurrently (e.g. 4 for a quad core, or double that if it has HyperThreading), so for CPU-bound work you cannot get better performance by creating more threads than that limit. Additional threads will in fact reduce performance, as you have seen, because thread scheduling and synchronization overhead increases while the work done per unit of time stays the same.
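As a minimal sketch of how to read that hardware limit at run time, the snippet below uses Python's standard library. The helper name `default_worker_count` is hypothetical and chosen only for illustration; the question itself is about Windows threads in general, not Python.

```python
import os


def default_worker_count() -> int:
    """Return the number of logical CPUs, falling back to 1 if unknown."""
    # os.cpu_count() reports logical processors, so HyperThreading siblings
    # are included. It may return None when the count cannot be determined.
    return os.cpu_count() or 1


print(default_worker_count())
```

For CPU-bound work this value is a reasonable upper bound to start measuring from, not a guaranteed optimum.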

The Task Parallel Library is a very good starting point if you want to let the runtime automatically manage some parameters for you -- and you can take explicit control if in the future you find there is a reason to do so.

Jon
4

It depends on what the threads are doing. If they are primarily CPU-bound, then the optimal number of threads is 1 per processor core. If they do any significant I/O where they are waiting for a response from the kernel, then more threads will increase performance.

There is context-switching overhead when you have more than one thread per core, so for CPU-bound calculations, increasing the number of threads beyond one per core will hurt performance.
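One common rule of thumb that captures this distinction is threads ≈ cores × (1 + wait time / compute time). It does not come from this answer; it is a widely quoted sizing heuristic, and the function name `suggested_threads` and its parameters are assumptions made for this sketch. Treat the result as a starting point for measurement.

```python
import math


def suggested_threads(cores: int, wait_time: float, compute_time: float) -> int:
    """Estimate a thread count from the ratio of waiting to computing.

    Heuristic (an assumption, not a guarantee):
        threads = cores * (1 + wait_time / compute_time)
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    if wait_time < 0:
        raise ValueError("wait_time must not be negative")
    return max(1, math.ceil(cores * (1 + wait_time / compute_time)))


# Purely CPU-bound work (no waiting) gives one thread per core.
print(suggested_threads(cores=4, wait_time=0.0, compute_time=1.0))
# Work that waits three times as long as it computes gives more threads.
print(suggested_threads(cores=4, wait_time=3.0, compute_time=1.0))
```

With no waiting the estimate collapses to one thread per core, which matches the CPU-bound case described above.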

Gerald
  • @garik, then it really depends more specifically on what you're doing. HyperThreaded virtual cores share cache and execution units so if your application makes efficient use of the CPU and memory then you're not going to get any benefit from using more threads. If the memory is fragmented and there are a lot of cache misses then you can get some performance boosts from using 2 threads per core. In some cases it is actually faster to disable HT and just run 1 thread per core. You'd just have to test to be sure. – Gerald Mar 01 '12 at 10:45
  • As an aside, if you have a semi-modern graphics adapter you can leverage the GPU for parallel processing to good effect with CUDA. Modern graphics adapters have hundreds of cores with multi-GB-per-second internal memory throughput. They have a limited instruction set, but for many floating-point calculations they are extremely fast. – Gerald Mar 01 '12 at 10:50
2

If you're looking for formulas, there's Amdahl's law:

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining promising portion of 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimum execution time cannot be less than that critical 1 hour.
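The quoted example can be worked through numerically. The sketch below uses the standard form of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of processors; the function name `amdahl_speedup` is chosen here for illustration.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Theoretical speedup for a program whose parallel_fraction can be
    spread over the given number of processors (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


total_hours = 20.0  # single-core runtime from the quoted example
for n in (1, 2, 4, 8, 16, 1024):
    speedup = amdahl_speedup(0.95, n)
    print(f"{n:>5} processors: speedup {speedup:5.2f}x, "
          f"runtime {total_hours / speedup:5.2f} h")
```

With 4 processors the runtime is 1 + 19/4 = 5.75 hours, and no processor count pushes the speedup past 1 / 0.05 = 20x, because the 1 serial hour always remains.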

Yuriy Guts
2

The reason why 10+ threads aren't necessarily faster than 3 is that there is an overhead associated with each thread. This overhead comes from managing the threads themselves: making sure that each gets appropriate processing time, and managing the data passed between threads.

Therefore, the more threads you have, the greater this non-processing overhead.

If you have a quad core processor, then each of three threads could run 100% of the time on its own core (it won't be true in practice, but it's an example). With 9 threads on those three cores, however, each thread can only run 33% of the time, because it has to share its core with 2 others. The overhead of managing this means that the 9 threads are actually slower than 3.
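The arithmetic behind this example can be sketched as an idealized model in which runnable threads split the available cores evenly. The helper `cpu_share_per_thread` is hypothetical and deliberately ignores scheduling overhead, so real results will be worse than the share it reports.

```python
def cpu_share_per_thread(cores: int, threads: int) -> float:
    """Ideal fraction of a core each thread gets when all threads are
    runnable and the scheduler divides the cores evenly."""
    if cores < 1 or threads < 1:
        raise ValueError("cores and threads must be at least 1")
    # A thread can never use more than one full core.
    return min(1.0, cores / threads)


for threads in (3, 9):
    share = cpu_share_per_thread(cores=3, threads=threads)
    print(f"{threads} threads on 3 cores: {share:.0%} of a core each")
```

The 33% figure corresponds to nine threads sharing three cores. If all four cores of a quad core were used, the same model would give 4/9, roughly 44%, per thread.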

ChrisF
1

You may take a look at the Task Parallel Library in .NET 4.0. And if you are running an older version of the framework you could use the thread pool to avoid the manual thread creation overhead.
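The pooling idea is not specific to .NET. As a rough analogue only (this is Python's `concurrent.futures`, not the Task Parallel Library API), the sketch below reuses a fixed set of worker threads instead of creating one thread per work item. The names `square` and `run_on_pool` are placeholders invented for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Placeholder work item; a real workload would go here."""
    return x * x


def run_on_pool(items, max_workers=None):
    """Run square() over items on a pool of reusable worker threads."""
    # Default to one worker per logical CPU when no count is given.
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands items to idle workers and returns results in order.
        return list(pool.map(square, items))


results = run_on_pool(range(10))
print(results)
```

In CPython the global interpreter lock keeps pure-Python CPU-bound threads from running in parallel, so this illustrates the pool pattern rather than a speedup.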

Darin Dimitrov
1

It's hard to give a precise general rule. Usually, more threads than cores make sense if your threads spend much of their time waiting (for I/O, for example). However, if you're truly computing, one thread per core is a good choice. More threads don't make the CPU faster, but they do increase scheduling effort.
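The waiting case is easy to observe with a small timing harness. In this sketch `time.sleep` stands in for a blocking I/O call, and the names `fake_io` and `elapsed`, along with the task count and delay, are assumptions picked for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay: float) -> float:
    """Simulate a blocking I/O call by sleeping."""
    time.sleep(delay)
    return delay


def elapsed(workers: int, tasks: int = 8, delay: float = 0.05) -> float:
    """Wall-clock seconds needed to finish all tasks with the given pool size."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_io, [delay] * tasks))
    return time.perf_counter() - start


for workers in (1, 2, 4, 8):
    print(f"{workers} workers: {elapsed(workers):.3f} s")
```

Because sleeping threads use no CPU, the total time keeps dropping as workers are added, up to the number of tasks. Replacing the sleep with real computation is the quickest way to see where that stops being true on a given machine.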

Matthias Meid