I've been playing around with threading, attempting to push some limits to the extreme, purely for my own amusement. I know the ThreadPool defaults to 25 threads and can be pushed up to 1000 (according to MSDN). What, though, is the practical limit of threads per CPU core? At some point, context switching will cause more of a bottleneck than threading saves. Does anyone have any best practices covering this? Are we talking 100, 200, 500? Does it depend on what the threads are doing? What determines, other than framework-dictated architecture, how many threads operate optimally per CPU core?
-
http://stackoverflow.com/questions/1718465/optimal-number-of-threads-per-core - this might answer some queries while you wait for answers here :) – dotalchemy Mar 09 '11 at 01:12
-
@dotalchemy - thanks, anecdotal info is usually pretty helpful; there were some good insights there. It'll be interesting to see if anyone raises info regarding best practices. – BobTheBuilder Mar 09 '11 at 01:17
-
Btw, the ThreadPool has changed dramatically across .NET service packs and CLR versions. .NET 2.0's CLR set the default ThreadPool limit to 25. The actual limits are: 1023 in .NET 4.0 32-bit, 32768 in .NET 4.0 64-bit, 250 per core in .NET 3.5 SP1, and 25 per core in .NET 2.0 – eduncan911 Jul 20 '12 at 18:50
2 Answers
It's all dependent on what the threads are doing, of course. If they are CPU-bound (say, sitting tight in an infinite loop), then one thread per core is enough to saturate the CPU; any more than that (and you will already have more, from background processes etc.) and you will start getting contention.
At the other extreme, if the threads are not eligible to run (e.g. blocked on a synchronization object), then the limit on how many you could have is dictated by factors other than the CPU: memory for the stacks, internal OS limits, and so on.
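A minimal sketch of the "one thread per core for CPU-bound work" idea (in Python for brevity, although the question is about .NET; the principle is runtime-agnostic, and `worker` is just a hypothetical stand-in for real CPU-bound work):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def worker(n):
    # Placeholder for CPU-bound work; a real task would saturate a core.
    return sum(i * i for i in range(n))

# One worker per logical core: for CPU-bound tasks, adding more threads
# than this only adds contention and context-switch overhead.
cores = os.cpu_count() or 1
with ThreadPoolExecutor(max_workers=cores) as pool:
    results = list(pool.map(worker, [10_000] * cores))

print(len(results) == cores)  # one result per worker
```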

If your application is not CPU-bound (like the majority of applications), then context switches are not a big deal, because a context switch happens every time your app has to wait anyway. The problem with having too many threads is the cost of the OS data structures behind them, plus synchronization anomalies such as starvation, where a thread never (or very rarely) gets a chance to execute due to the randomness of the synchronization algorithms.
If your application is CPU-bound (it spends 99% of its time working on memory and very rarely does I/O or waits for something else, such as user input or another thread), then the optimum is one thread per logical core, because that keeps context switching to a minimum.
Beware that the OS still interrupts threads periodically, even when there is only one thread per CPU. The OS preempts threads not only to switch tasks, but also for thread-management purposes (like updating the counters shown in Task Manager, or allowing a superuser to kill a thread).
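To illustrate the non-CPU-bound case, here is a small sketch (again Python rather than .NET, purely for illustration) showing that many blocked threads can share very few cores, because their waits overlap rather than compete:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_task(_):
    # Simulated I/O wait: the thread is blocked, not burning CPU,
    # so many such threads can coexist on a single core.
    time.sleep(0.05)

start = time.monotonic()
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(io_task, range(32)))
elapsed = time.monotonic() - start

# 32 sleeps of 50 ms overlap almost completely: wall time stays close
# to one sleep, nowhere near the serial 32 * 0.05 = 1.6 s.
print(elapsed < 1.0)
```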

-
Let's say you're doing the usual line-of-business stuff, like receiving socket data and transmitting it at high speed into a database of some description - for argument's sake, SQL Server. Theoretically, .NET wouldn't be handling the I/O in the true sense - it could be handing the data to SQL Server via SqlBulkCopy. – BobTheBuilder Mar 09 '11 at 01:31
-
Just by running the OS, there are already a whole lot of threads floating around. So one thread per core is optimal, yes, but it just never happens in reality. I fail to see the point of knowing the optimal number of threads per core... your app is not the only one running. – Joel Gauvreau Mar 09 '11 at 01:39
-
@Joel Gauvreau Call it curiosity. Isn't that why we ended up in this job in the first place? A thirst for knowledge and understanding... so my answer to "what is the point of knowing?" is "just to know." – BobTheBuilder Mar 09 '11 at 01:57
-
While the OS has many threads around, none of them are CPU-bound (otherwise the OS would be burning CPU time meant for apps). For CPU-bound apps (which are an EXTREMELY RARE case), one thread per core is optimal, period. If the app is "almost" CPU-bound (say 90 ms of CPU and 10 ms of waiting in each 100 ms), then adding more threads is useful. Since the majority of apps spend 90 ms waiting and 10 ms (or less) on CPU, many threads can be useful. But still, there is no magic number that works in all cases. – fernacolo Mar 09 '11 at 02:07
-
If you are receiving socket data and transmitting it at high speed into a database, then you need a lot of tuning to find the optimal number! I suggest you run some tests with 10, 20, 40, and so on, doubling while throughput increases. Eventually it will stabilize (it will not decrease if you have a lot of memory). Then you will have a good estimate. Just beware that conditions may vary over time, so no tuning will be perfect. – fernacolo Mar 09 '11 at 02:14
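The doubling strategy described in the comment above can be sketched roughly like this (Python for brevity; `fake_insert` is a hypothetical stand-in for the real blocking database write):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_insert(_):
    # Stand-in for a database write: mostly waiting on network/disk.
    time.sleep(0.01)

def throughput(n_threads, n_ops=100):
    """Operations per second with a pool of n_threads workers."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(fake_insert, range(n_ops)))
    return n_ops / (time.monotonic() - start)

# Double the thread count while throughput keeps improving meaningfully.
best_n, best_tp = 1, throughput(1)
n = 2
while n <= 64:
    tp = throughput(n)
    if tp < best_tp * 1.1:  # stabilized: gains under 10%, stop here
        break
    best_n, best_tp = n, tp
    n *= 2

print(best_n >= 2)  # for blocking work, more than one thread should win
```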
-
@Fernando I agree, so basically, if you are CPU-bound, more than one thread per CPU is just going to make it worse by forcing the OS to do extra context switching. – Joel Gauvreau Mar 09 '11 at 02:17
-
@BobTheBuilder It could be interesting to try a few scenarios as a benchmark and find out. – Joel Gauvreau Mar 09 '11 at 02:29