6

I overheard a coworker saying that a Task is basically a lightweight thread. Coming from a C++ background (where threads were the lightest-weight processing unit), this seems counter-intuitive to me.

Aren't Tasks just as heavy as Threads?

Richard

3 Answers

9

You need to distinguish a unit of work (a Task) from the underlying thread or process used to host and execute it. Tasks do not even need to run on other threads. For example, Tasks can be executed in a single-threaded application that periodically yields control to the task pool.
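
As a rough illustration of the single-threaded case, here is a minimal Python sketch (not .NET code). The `TaskQueue` class and its `submit` and `run_pending` methods are hypothetical names invented for this example; they only show that queued units of work can all be executed on the thread that drains the queue.

```python
import threading
from collections import deque


class TaskQueue:
    """Hypothetical minimal task queue: work is queued, then run by the caller."""

    def __init__(self):
        self._pending = deque()

    def submit(self, fn, *args):
        # Creating a task is just an append; no thread is created.
        self._pending.append((fn, args))

    def run_pending(self):
        # Run every queued task on the calling thread, in FIFO order.
        results = []
        while self._pending:
            fn, args = self._pending.popleft()
            results.append(fn(*args))
        return results


def square(n):
    # Return the result together with the id of the thread that ran the task.
    return n * n, threading.get_ident()


task_queue = TaskQueue()
for n in range(5):
    task_queue.submit(square, n)

# The application would do its own work here, then yield control to the queue.
results = task_queue.run_pending()
values = [value for value, _ in results]
thread_ids = {tid for _, tid in results}
print(values, "ran on", len(thread_ids), "thread")
```

Every task reports the same thread id as the caller, because nothing here ever starts a second thread.
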

Even when Tasks are executed on separate threads, there is usually not a one-to-one relationship between Task and Thread. The threads are preallocated as part of a pool, and tasks are scheduled to run on these threads as they become available. Creating a new task does not incur the overhead of creating a thread; it only costs an enqueue on a task queue.

This makes tasks inherently more scalable. I can have millions of tasks throughout the lifetime of my application, but only ever actually use some constant number of threads.
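
The pooled case can be sketched the same way. This is a Python analogue using the standard library's `concurrent.futures.ThreadPoolExecutor`, not the .NET scheduler; `POOL_SIZE`, `TASK_COUNT`, and `work` are names chosen for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4      # assumed pool size for the example
TASK_COUNT = 1000  # far more tasks than threads


def work(n):
    # Return the result together with the id of the worker thread that ran it.
    return n * 2, threading.get_ident()


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # Each submission is an enqueue; the pool reuses its worker threads.
    outcomes = list(pool.map(work, range(TASK_COUNT)))

values = [value for value, _ in outcomes]
worker_ids = {tid for _, tid in outcomes}
print(len(values), "tasks ran on", len(worker_ids), "thread(s)")
```

The number of distinct worker threads never exceeds `POOL_SIZE`, no matter how large `TASK_COUNT` is.
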

Chris Pitman
  • "This makes tasks inherently more scalable." It should be noted that this isn't unique to `Task`. The older `ThreadPool` class provides the same scalability, as do I/O Completion Ports, or just implementing a Producer/Consumer queue with a fixed number of consumer `Thread`s. – mbeckish Jun 14 '13 at 14:56
  • @mbeckish You are correct. I believe the standard implementation is exactly that: an ease-of-use layer on top of ThreadPool. – Chris Pitman Jun 14 '13 at 15:08
2

Typically a "thread" implies mandatory concurrency. Starting up a thread requires allocating a stack and internal OS data structures for it. In contrast, a "task" often refers to a piece of work for which concurrency is optional. A parallel framework (such as OpenMP, Cilk Plus, TBB, or PPL) can therefore use the same thread to execute many tasks by serializing them, converting optional parallelism into real parallelism only as necessary to keep the machine busy.

Arch D. Robison
  • I'm getting hung up on the idea of "mandatory concurrency". If you have more threads than you have cores, you'll be serializing them *at some level*. How does the idea of mandatory concurrency apply to a large number of threads? (I can understand what you're saying from the context of the other answers. This term is just confusing.) – Richard Jun 14 '13 at 15:40
  • @Richard - The cost of a .NET `Thread` object is independent of the number of cores, or even the number of kernel threads to which the managed threads are mapped. The cost is 1) roughly 1 MB of memory allocated per `Thread` object, and 2) the CPU cost of context switching when you have more threads than cores. The saving provided by `Task` or `ThreadPool` is that your application can create 1000 `Task` objects while .NET might handle them with only 20 `Thread` objects. – mbeckish Jun 14 '13 at 16:57
  • More correctly, 'the CPU cost of context switching when you have more ready threads than cores'. – Martin James Jun 14 '13 at 17:03
  • @mbeckish I'm not confused about the cost of threads, but rather about how threads imply mandatory concurrency. If you have more threads than cores, they will be switched in and out, implying that they are executed (at least in part) sequentially. To me, it seems the term "mandatory concurrency" is either a misnomer or a misunderstanding. – Richard Jun 14 '13 at 17:18
  • @Richard - I'm pretty sure "mandatory concurrency" means "creating N Thread objects concurrently for N concurrent tasks", which is costly, as opposed to "creating some k < N Thread objects concurrently for N concurrent tasks". In other words, if the developer is not going to manually throttle the concurrent work and instead wants to immediately set up all tasks to be scheduled, the bad option is to create N `Thread` objects for your N tasks, and the good option is to create N `Task` objects, which behind the scenes will use fewer than N `Thread` objects. – mbeckish Jun 14 '13 at 17:28
  • The difference between "mandatory concurrency" and "optional concurrency" is that "optional concurrency" has weaker progress guarantees. Suppose I have a shared bounded queue, and a producer activity enqueuing items and a consumer activity dequeuing items. If each activity is a thread, no deadlock occurs. If each activity is a task, deadlock might occur if the runtime decides to serialize the tasks in a way that attempts to run the producer to completion before starting the consumer, or vice-versa. – Arch D. Robison Jun 14 '13 at 18:49
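
The bounded-queue scenario from the last comment can be sketched in Python. This is an analogue built on `queue.Queue` and `concurrent.futures.ThreadPoolExecutor`, not a model of any particular .NET or TBB scheduler. `run_pipeline`, `ITEMS`, `CAPACITY`, and `TIMEOUT` are hypothetical names, and the timeout stands in for "blocked forever" so that the example terminates. A pool with one worker serializes the two tasks (producer first); a pool with two workers behaves like one thread per activity.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

ITEMS = 5      # items the producer wants to enqueue
CAPACITY = 2   # bound of the shared queue (smaller than ITEMS)
TIMEOUT = 0.5  # seconds; stands in for blocking forever


def run_pipeline(workers):
    """Run a producer task and a consumer task on a pool with `workers` threads."""
    shared = queue.Queue(maxsize=CAPACITY)

    def producer():
        for item in range(ITEMS):
            # Blocks while the queue is full; raises queue.Full after TIMEOUT.
            shared.put(item, timeout=TIMEOUT)

    def consumer():
        # Blocks while the queue is empty; raises queue.Empty after TIMEOUT.
        return [shared.get(timeout=TIMEOUT) for _ in range(ITEMS)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        produced = pool.submit(producer)
        consumed = pool.submit(consumer)
        try:
            produced.result()
            return consumed.result()
        except (queue.Full, queue.Empty):
            return "stalled"


print(run_pipeline(2))  # each activity gets its own thread
print(run_pipeline(1))  # tasks serialized: producer must finish before consumer starts
```

With two workers the consumer drains the queue while the producer fills it, so all items get through. With one worker the producer fills the queue and then waits for a consumer that cannot start until the producer finishes, which is the weaker progress guarantee described above. Without the timeouts, that second call would simply hang.
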
1

You are right - everything runs on a thread under the covers.

The reason people say that a Task is more lightweight than a Thread is that Microsoft put a lot of thought into having Tasks make efficient use of Threads, and the implementation is probably much lighter weight than what the average developer would come up with on their own using the Thread class.

EDIT

A clearer explanation is that a Task object is lighter weight than a Thread object, and while each Task is eventually run on a Thread, creating N Task objects concurrently leads to fewer than N concurrent Thread objects being used, for large N.

mbeckish