39

Can someone list some points of comparison between thread spawning and thread pooling? Which one is better? Please consider the .NET Framework as a reference implementation that supports both.

Hans Passant
reonze

12 Answers

27

A "pool" contains a list of available "threads" ready to be used whereas "spawning" refers to actually creating a new thread.

The usefulness of "Thread Pooling" lies in "lower time-to-use": creation time overhead is avoided.

In terms of "which one is better": it depends. If the creation-time overhead is a problem, use thread pooling. This is a common problem in environments where lots of "short-lived tasks" need to be performed.

As pointed out by other folks, there is a "management overhead" for Thread-Pooling: this is minimal if properly implemented. E.g. limiting the number of threads in the pool is trivial.
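
To make the reuse point concrete, here is a minimal sketch. It uses Java's `ExecutorService` as a stand-in for a pool (an assumption; the question is about .NET, where `ThreadPool` plays the same role), and the class and method names are made up for illustration. It counts how many distinct threads end up running the same batch of short tasks under each approach.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative.
class SpawnVsPool {

    // Spawning: every task gets a brand-new thread.
    // Returns the number of distinct threads that ran a task.
    static int distinctThreadsWhenSpawning(int tasks) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            threads[i] = new Thread(() -> seen.add(Thread.currentThread()));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    // Pooling: at most poolSize threads are created and then reused.
    static int distinctThreadsWhenPooling(int tasks, int poolSize) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> seen.add(Thread.currentThread()));
        }
        pool.shutdown(); // accept no new tasks, finish the queued ones
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("spawned: " + distinctThreadsWhenSpawning(20) + " threads for 20 tasks");
        System.out.println("pooled:  " + distinctThreadsWhenPooling(20, 4) + " threads for 20 tasks");
    }
}
```

With spawning, the thread count grows with the number of tasks; with the pool it never exceeds the pool size, which is where the saved creation overhead comes from.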

jldupont
  • 2
    Is there a big difference (in terms of time consumption) between creating a new thread and picking a task from a queue and assigning it to a thread from a pool, on Windows XP? – reonze Jan 12 '10 at 15:21
  • I haven't done any serious Windows programming in ages but I am pretty sure that picking an available thread from a list (simple operation) is far cheaper than spawning a new thread. – jldupont Jan 12 '10 at 15:38
  • 2
    Spawning a new physical thread means a transition into kernel mode and setting up a bunch of kernel structures, so the overhead will be higher. But as we all know, you shouldn't optimize prematurely. If you only create one thread you're not going to notice the overhead; if you are creating thousands you probably will, though even that depends on the work the thread is doing, which may swamp the time it takes to create the thread. – Kevin Jones Jan 12 '10 at 16:39
27

Thread pool threads are much cheaper than a regular Thread because they pool the system resources required for threads. But they have a number of limitations that may make them unfit:

  • You cannot abort a threadpool thread
  • There is no easy way to detect that a threadpool thread completed; there is no equivalent of Thread.Join()
  • There is no easy way to marshal exceptions from a threadpool thread
  • You cannot display any kind of UI on a threadpool thread beyond a message box
  • A threadpool thread should not run longer than a few seconds
  • A threadpool thread should not block for a long time

The latter two constraints are a side effect of the threadpool scheduler, which tries to limit the number of active threads to the number of cores your CPU has available. This can cause long delays if you schedule many long-running tasks that block often.

Many other threadpool implementations have similar constraints, give or take.

Hans Passant
  • 4
    This answer is totally language / platform specific, but worded as if it were a general answer. Since the OP mentioned neither a platform nor a language: -1 – Max Truxa May 22 '13 at 16:32
10

For some definition of "better", you generally want to go with a thread pool. Without knowing what your use case is, consider that with a thread pool, you have a fixed number of threads which can all be created at startup or can be created on demand (but the number of threads cannot exceed the size of the pool). If a task is submitted and no thread is available, it is put into a queue until there is a thread free to handle it.

If you are spawning threads in response to requests or some other kind of trigger, you run the risk of depleting all your resources, as there is nothing to cap the number of threads created.

Another benefit to thread pooling is reuse - the same threads are used over and over to handle different tasks, rather than having to create a new thread each time.

As pointed out by others, if you have a small number of tasks that will run for a long time, this would negate the benefits gained by avoiding frequent thread creation (since you would not need to create a ton of threads anyway).
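
The cap-and-queue behaviour described above can be sketched as follows. This is only an illustration under assumptions: it uses Java's fixed-size `ExecutorService` rather than the .NET pool, and `BoundedPool` and `peakConcurrency` are hypothetical names. It submits more tasks than there are threads and records the highest number of tasks that were ever running at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative.
class BoundedPool {

    // Runs `tasks` short jobs on a pool of `poolSize` threads and
    // returns the peak number of jobs that were running simultaneously.
    static int peakConcurrency(int tasks, int poolSize) {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            // Tasks beyond poolSize wait in the pool's internal queue.
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // simulate a little work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(12, 3));
    }
}
```

However many tasks are submitted, the peak never exceeds the pool size; the excess simply waits its turn.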

danben
  • 2
    There is no reason why you couldn't cap the number of spawned threads. – Tony Edgecombe Jan 12 '10 at 15:21
  • 1
    That's true, I didn't claim it was impossible. But then why not use a pool? – danben Jan 12 '10 at 15:23
  • Well, unless 'generally' is a relatively long running task (anything over a few seconds is, really). (Ab)using the threadpool for long running operations could lead to nasty forms of deadlocks etc. – Wim Jan 12 '10 at 15:24
  • 1
    Creating new threads is a more expensive operation than one might expect. This makes it especially worthwhile if many very short tasks are being executed. – Thorarin Jan 12 '10 at 15:24
  • +1 for pointing out that it's more about resource control than cost-to-spawn – kdgregory Jan 12 '10 at 15:24
  • 1
    @Wim: what is *the* thread pool you are referring to? This is a language and platform agnostic question. – Thorarin Jan 12 '10 at 15:27
  • Thorarin, yes so equally read it as such. Assuming the current platform/runtime has a built in threadpool seems reasonable, given OP's question. – Wim Jan 12 '10 at 15:31
  • @Wim: If I'm using .net Threadpool (limit of 100 threads) for thousands of tasks and some of them may take more than ten seconds, in this scenario can you tell us how a deadlock occurs? At some point of time thread pool will finish all the tasks as every thread will finish the task for sure (in my scenario). I'm not clear how deadlock occurs here. Can you clarify that some more? – JPReddy Sep 16 '10 at 12:51
  • @JPReddy: It doesn't *just* happen. You'd have to have a scenario where you have dependencies; one of the threads waits for the result of another. If there are no threads available, deadlock could easily occur. – Wim Sep 25 '10 at 19:55
6

It all depends on your scenario. Creating new threads is resource intensive and an expensive operation. Most very short asynchronous operations (less than a few seconds max) could make use of the thread pool.

For longer running operations that you want to run in the background, you'd typically create (spawn) your own thread. (Ab)using a platform/runtime built-in threadpool for long running operations could lead to nasty forms of deadlocks etc.
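
One way such a deadlock can arise is when a pooled task blocks waiting for another task that needs a thread from the same pool. The sketch below is an assumption-laden illustration, not a description of any particular runtime: it uses a Java `ExecutorService` in place of a built-in pool, the names are hypothetical, and a timeout is used so the demo reports the stall instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; all names here are illustrative.
class PoolStarvation {

    // Returns true if the outer task failed to finish within the timeout.
    static boolean outerTaskStalls(int poolSize, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<String> outer = pool.submit(() -> {
                // The outer task occupies a pool thread while it waits
                // for an inner task that also needs a pool thread.
                Future<String> inner = pool.submit(() -> "inner done");
                return inner.get();
            });
            try {
                outer.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false; // completed normally
            } catch (TimeoutException e) {
                return true;  // nobody was free to run the inner task
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            pool.shutdownNow(); // interrupt any worker that is still stuck
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread stalls:  " + outerTaskStalls(1, 300));
        System.out.println("2 threads stall:  " + outerTaskStalls(2, 2000));
    }
}
```

With a single worker the outer task can never complete, because the only thread is busy waiting for work that is stuck behind it in the queue. A dedicated thread for the waiting task avoids the problem.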

Wim
6

My feeling is that you should start just by creating a thread as needed... If the performance of this is OK, then you're done. If at some point, you detect that you need lower latency around thread creation you can generally drop in a thread pool without breaking anything...
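
Dropping in a pool later is easiest if the calling code never says how a task gets its thread. A minimal sketch of that idea, assuming Java's `Executor` interface as the abstraction (in .NET one would wrap the equivalent calls; the names below are hypothetical):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative.
class SwappableExecution {

    // Strategy 1: spawn a fresh thread for every task.
    static final Executor SPAWN = task -> new Thread(task).start();

    // The work itself only sees an Executor, so the threading strategy
    // can be swapped without touching this method.
    // Adds the numbers 1..tasks, one per task, and returns the total.
    static int runAll(Executor executor, int tasks) {
        CountDownLatch done = new CountDownLatch(tasks);
        AtomicInteger sum = new AtomicInteger();
        for (int i = 1; i <= tasks; i++) {
            final int n = i;
            executor.execute(() -> {
                sum.addAndGet(n);
                done.countDown();
            });
        }
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println("spawning: " + runAll(SPAWN, 10));

        // Strategy 2: the same work on a pool, with no change to runAll.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        System.out.println("pooling:  " + runAll(pool, 10));
        pool.shutdown();
    }
}
```

Both calls produce the same result; only the one line that chooses the executor differs.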

dicroce
3

Thread pooling is usually considered better, because the threads are created up front, and used as required. Therefore, if you are using a lot of threads for relatively short tasks, it can be a lot faster. This is because they are saved for future use and are not destroyed and later re-created.

In contrast, if you only need 2-3 threads and they will only be created once, then spawning them directly will be better. This is because you do not gain from caching existing threads for future use, and you are not creating extra threads which might not be used.

winwaed
2

It depends on what you want to execute on the other thread.

For short tasks it is better to use a thread pool; for long tasks it may be better to spawn a new thread, since a long task could starve the thread pool of threads for other tasks.
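
The starvation effect can be reproduced in a few lines. The following is a sketch under assumptions: it uses a small fixed-size Java `ExecutorService` (real pools, including .NET's, can grow, which softens but does not remove the effect), and the names are hypothetical. Two "long" tasks block until released; a short task is then queued, and the method reports whether it managed to run while the long tasks were still busy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative.
class LongTasksStarvePool {

    // Returns true if the short task ran while the long tasks were still blocked.
    static boolean shortTaskRunsPromptly(boolean longTasksOnOwnThreads) {
        int poolSize = 2;
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);  // keeps the long tasks busy
        CountDownLatch shortRan = new CountDownLatch(1); // signalled by the short task
        Runnable longTask = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < poolSize; i++) {
                if (longTasksOnOwnThreads) {
                    new Thread(longTask).start(); // dedicated thread
                } else {
                    pool.execute(longTask);       // occupies a pool worker
                }
            }
            pool.execute(shortRan::countDown);
            return shortRan.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the long tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("long tasks in pool:        " + shortTaskRunsPromptly(false));
        System.out.println("long tasks on own threads: " + shortTaskRunsPromptly(true));
    }
}
```

When the long tasks sit in the pool, the short task is stuck in the queue behind them; when they get their own threads, the pool stays free for short work.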

Jeff Cyr
  • 1
    It is also nice for performance analysis and logging that you can rename a spawned thread, which makes it easy to identify; with a threadpool it is more difficult to guess which thread belongs to which task. – weismat Jan 12 '10 at 15:33
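
A small sketch of the naming difference, using Java as a stand-in (an assumption; .NET exposes a similar `Thread.Name` property) and hypothetical class and thread names. A spawned thread can carry the name of its task, while a pool worker's name, here assigned through a `ThreadFactory`, identifies the worker rather than whatever task it happens to be running.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative.
class NamedThreads {

    // A spawned thread is named after the task it was created for.
    static String nameSeenInSpawnedThread(String taskName) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = Thread.currentThread().getName(), taskName);
        t.start();
        try {
            t.join(); // join makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    // Pool workers get generic, numbered names from a factory.
    static String nameSeenInPoolThread(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = r -> new Thread(r, prefix + "-" + counter.incrementAndGet());
        ExecutorService pool = Executors.newFixedThreadPool(2, factory);
        try {
            return pool.submit(() -> Thread.currentThread().getName()).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(nameSeenInSpawnedThread("report-writer"));
        System.out.println(nameSeenInPoolThread("worker"));
    }
}
```

In a log or profiler, "report-writer" says what the thread is doing, whereas "worker-1" only says which pool thread did it.
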
2

The main difference is that a ThreadPool maintains a set of threads that are already spun-up and available for use, because starting a new thread can be expensive processor-wise.

Note however that even a ThreadPool needs to "spawn" threads... it usually depends on workload - if there is a lot of work to be done, a good threadpool will spin up new threads to handle the load based on configuration and system resources.
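
The on-demand growth can be observed directly. This sketch assumes Java's `ThreadPoolExecutor`, configured with no core threads and a hand-off queue so that it behaves like a pool that only spawns when needed; the .NET pool uses its own heuristics, and the class and method names here are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative.
class GrowingPool {

    // Returns {pool size before any work, pool size while `tasks` jobs are blocked}.
    static int[] poolSizeBeforeAndUnderLoad(int tasks) {
        // No core threads, up to `tasks` threads, idle threads retire after 60 s.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Math.max(1, tasks), 60L, TimeUnit.SECONDS, new SynchronousQueue<>());
        int before = pool.getPoolSize();

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            // Every worker is busy waiting, so each new task forces a new thread.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int underLoad = pool.getPoolSize();

        release.countDown();
        pool.shutdown();
        return new int[] {before, underLoad};
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBeforeAndUnderLoad(3);
        System.out.println("threads before work: " + sizes[0]);
        System.out.println("threads under load:  " + sizes[1]);
    }
}
```

The pool starts empty and spawns a thread only when a task arrives and no existing worker is free to take it.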

Michael Bray
0

For multithreaded execution combined with getting return values from the execution, or for an easy way to detect that work submitted to a thread pool has completed, Java Callables could be used.

See https://blogs.oracle.com/CoreJavaTechTips/entry/get_netbeans_6 for more info.
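
A minimal sketch of the Callable approach, using only `java.util.concurrent` classes (the demo class and method names are hypothetical). Each `Callable` returns a value through a `Future`; calling `get()` waits for completion, and an exception thrown inside the task is delivered to the caller wrapped in an `ExecutionException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; all names here are illustrative.
class CallableResults {

    // Computes 1^2 + 2^2 + ... + n^2, one Callable per term.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                jobs.add(() -> x * x);
            }
            int total = 0;
            // invokeAll returns once every job has completed.
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Shows how a failure inside a pooled task reaches the caller.
    static String failureMessage() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> failing = () -> {
                throw new IllegalArgumentException("boom");
            };
            pool.submit(failing).get();
            return null; // not reached: get() rethrows the failure
        } catch (ExecutionException e) {
            return e.getCause().getMessage();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sum of squares 1..4 = " + sumOfSquares(4));
        System.out.println("task failed with: " + failureMessage());
    }
}
```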

Kees van Dieren
0

There is a little extra time required for creating/spawning a thread, whereas a thread pool already contains created threads which are ready to be used.

0

This answer is a good summary but just in case, here is the link to Wikipedia:

http://en.wikipedia.org/wiki/Thread_pool_pattern

Community
Kelly S. French
0

Assuming C# and Windows 7 and up...

When you create a thread using new Thread(), you create a managed thread that becomes backed by a native OS thread when you call Start – a one-to-one relationship. It is important to know that only one thread runs on a CPU core at any given time.

An easier way is to call ThreadPool.QueueUserWorkItem, which queues a work item to run on a thread-pool worker (a background thread). Each worker is itself a managed thread backed by its own native thread, but a work item isn't tied to any particular thread: the pool runs many queued items, one after another, on a small set of reusable workers. With, say, 4 cores, the pool by default starts out with roughly 4 workers and adds more only when the queue backs up. This is lighter weight than new Thread() because handing the next work item to an idle worker doesn't require creating and tearing down a native thread, which involves crossing from user mode to kernel mode.

It may be important to note that heavy multitasking might benefit from dedicated threads in a well-designed multithreading framework. However, the performance benefits aren't that much.

When using the ThreadPool, just make sure the minimum worker thread count is high enough, or ThreadPool.QueueUserWorkItem will be slower than new Thread(). In a benchmark test looping 512 times, calling new Thread() left ThreadPool.QueueUserWorkItem in the dust with default minimums. However, first setting the minimum worker thread count to 512 made new Thread() and ThreadPool.QueueUserWorkItem perform similarly in this test.

A side effect of setting a high minimum worker thread count is that new Task() (or Task.Factory.StartNew) also performed similarly to new Thread() and ThreadPool.QueueUserWorkItem.