When I need some parallel processing I usually do it like this:

static void Main(string[] args)
{
    var tasks = new List<Task>();
    var toProcess = new List<string>{"dog", "cat", "whale", "etc"};
    toProcess.ForEach(s => tasks.Add(CanRunAsync(s)));
    Task.WaitAll(tasks.ToArray());
}

private static async Task CanRunAsync(string item)
{
    // simulate some work
    await Task.Delay(10000);
}

I had cases when this did not process the items in parallel and had to use Task.Run to force it to run on different threads.

What am I missing?

Theodor Zoulias
  • 4
    A `Task` is **not** a `Thread`. In fact, a `Task` can run on the exact same `Thread`. Have a look at https://stackoverflow.com/questions/34375696/executing-tasks-in-parallel – MakePeaceGreatAgain Mar 13 '19 at 14:03
  • A list of tasks doesn't run at all. It's just a list of Task objects, that may actually be cold. What does `CanRun` do? Does it start any tasks or does it return cold tasks? – Panagiotis Kanavos Mar 13 '19 at 14:05
  • 2
    "_What am I missing?_" The same thing we are missing, I guess: context in your question that explains what your tasks are actually doing, and what this CanRun method does to set up and run tasks... –  Mar 13 '19 at 14:06
  • Can you show the `CanRun` method? – Yacoub Massad Mar 13 '19 at 14:06
  • 1
    If you want to process a lot of data in parallel you should use the purpose-built APIs, `Parallel.ForEach` or PLINQ. `Parallel.ForEach(toProcess,processingFunction)` will process all data in parallel. So will `toProcess.AsParallel().Select(str=>......).ToArray()` – Panagiotis Kanavos Mar 13 '19 at 14:06
  • (As an aside, this could be more tersely written `Task.WaitAll(toProcess.Select(CanRun))`) – canton7 Mar 13 '19 at 14:07
  • 1
    `had to use Task.Run` you *always* have to use that, or TaskFactory.StartNew, if you want to create a hot task, i.e. one that executes its function – Panagiotis Kanavos Mar 13 '19 at 14:08
  • @PanagiotisKanavos No, that's just false. In fact, basically any method that ever returns a task should return a "hot" task. Any that don't are almost certainly broken. `Task.Run` exists to create a task that runs a synchronous CPU bound operation on a thread pool thread. That's it. If that's not what they are trying to do, there's no need for `Task.Run`. – Servy Mar 13 '19 at 14:20
  • @servy and we don't know what the OP wants to do yet. Starting another nitpicking round isn't going to help anyone. – Panagiotis Kanavos Mar 13 '19 at 14:22
  • 1
    @PanagiotisKanavos Of course we don't know what the OP wants to do. But your statement is just wrong, regardless of what the OP wants to do. You stating false things about what `Task.Run` does or how to use it is harmful; correcting those misconceptions *is* helping people. – Servy Mar 13 '19 at 14:36
  • @Servy OK, you win, you're right as always and comments should be downvotable – Panagiotis Kanavos Mar 13 '19 at 14:42
  • @PanagiotisKanavos: Tasks always return hot. I'm not really sure what a "cold" `Task` would be unless it's just an already completed `Task`, such as from `Task.CompletedTask` or `Task.FromResult`. – Chris Pratt Mar 13 '19 at 19:14
  • @ChrisPratt you'll find a LOT of SO questions with people creating cold tasks with `new Task()` then starting them with `.Start()` as if they were threads. It's quite a common problem. If you search for `[c#] "new Task("` you'll find quite a few cases. The funny thing is, if you read Servy's first comment he agrees. His disagreement seems to be that I didn't include a full Intro To Tasks in a two-line comment about a question whose original code and wording didn't make clear what Task was returned. `had to use Task.Run to force it to run` sounded like a new'd Task was returned – Panagiotis Kanavos Mar 15 '19 at 07:59
  • @ChrisPratt and of course, nobody remembered to mention TaskCompletionSource, which makes everyone wrong I guess – Panagiotis Kanavos Mar 15 '19 at 08:04
  • 1
    @PanagiotisKanavos My disagreement is that you said that `Task.Run` is the only way to create a hot task, which is just demonstrably false, and not the proper way to get a hot task for lots of common operations. Adding the statement that the OP *needs* to use `Task.Run` is both wrong, confusing, and harmful, and adds nothing useful to the conversation at all. I didn't criticize you for not saying more, you should simply have never posted that one comment. It adds nothing useful, is harmful, and is just false. Saying nothing is better than saying something wrong and misleading. – Servy Apr 10 '19 at 13:34

2 Answers

Task means "a thing that needs doing, which may have already completed, may be executing on a parallel thread, or may be depending on out-of-process data (sockets, etc), or might just be ... connected to a switch somewhere that says 'done'" - it has very little to do with threading, other than: if you schedule a continuation (aka await), then somehow that will need to get back onto a thread to fire, but how that happens and what that means is up to whatever code created and owns the task.

Note: parallelism can be expressed in terms of multiple tasks (if you so choose), but multiple tasks doesn't imply parallelism.
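
To illustrate the "connected to a switch somewhere" case, here is a minimal sketch using `TaskCompletionSource<T>`. The `SwitchDemo` class and `FlipSwitch` method are hypothetical names chosen for this example. The point is that the task exists and can be awaited, yet no thread is running on its behalf until some code decides to complete it:

```csharp
using System;
using System.Threading.Tasks;

static class SwitchDemo
{
    // Hypothetical helper: reports the task's state before and after the "switch" is flipped.
    public static (bool before, bool after, string result) FlipSwitch()
    {
        // This task is not backed by any running code; it is only a promise.
        var tcs = new TaskCompletionSource<string>();
        Task<string> task = tcs.Task;

        bool before = task.IsCompleted;   // still pending, and no thread is busy with it

        tcs.SetResult("done");            // flip the switch

        return (before, task.IsCompleted, task.Result);
    }

    static void Main()
    {
        var (before, after, result) = FlipSwitch();
        Console.WriteLine($"{before} {after} {result}"); // False True done
    }
}
```

Whoever owns the `TaskCompletionSource` decides when (and on which thread) the task completes, which is why a `Task` on its own says nothing about threading.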

In your case: it all depends on what CanRun does or is - and we don't know that. It should also probably be called CanRunAsync.

Marc Gravell
  • Thank you for your answer. So what you're saying is that it doesn't matter if I use `Task.Run` or `Task.Factory.StartNew` or just add the async method directly to the tasklist, there is no guarantee that they will run on different threads? – worker-25847 Mar 13 '19 at 15:19
  • 1
    @Alex no, that's not what I'm saying; those things *are* thread-related, although there's no guarantee how many will run in parallel (that's up to the scheduler); I'm responding to the question in the original form, when `CanRun` was vague: **in the general case**: task doesn't mean thread. – Marc Gravell Mar 13 '19 at 15:38

I had cases when this did not process the items in parallel and had to use Task.Run to force it to run on different threads.

Most likely these cases were associated with methods that have an asynchronous contract, but their implementation is synchronous. Like this method for example:

static async Task NotAsync(string item)
{
    Thread.Sleep(10000); // Simulate a CPU-bound calculation, or a blocking I/O operation
    await Task.CompletedTask;
}

Any thread that invokes this method will be blocked for 10 seconds, and then it will be handed back an already completed task. Although the contract of the NotAsync method is asynchronous (it has an awaitable return type), its actual implementation is synchronous because it does all the work during the invocation. So when you try to create multiple tasks by invoking this method:

toProcess.ForEach(s => tasks.Add(NotAsync(s)));

...the current thread will be blocked for 10 seconds * number of tasks. When these tasks are created they are all completed, so waiting for their completion will cause zero waiting:

Task.WaitAll(tasks.ToArray()); // Waits for 0 seconds

By wrapping the NotAsync in a Task.Run you ensure that the current thread will not be blocked, because the NotAsync will be invoked on the ThreadPool.

toProcess.ForEach(s => tasks.Add(Task.Run(() => NotAsync(s))));

Task.Run immediately returns a Task, with guaranteed zero blocking.
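
The difference can be observed by timing only the task-creation step. The following is a rough sketch, not a benchmark: the `BlockingDemo` class and `TimeCreation` helper are hypothetical, and the delay is shortened to an assumed 500 ms so the demo finishes quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class BlockingDemo
{
    // Same shape as NotAsync above, with a shorter (assumed) delay.
    public static async Task NotAsync(string item)
    {
        Thread.Sleep(500); // blocks the calling thread
        await Task.CompletedTask;
    }

    // Hypothetical helper: measures how long it takes just to create the tasks,
    // before any waiting on them happens.
    public static TimeSpan TimeCreation(
        Func<string, Task> start, IEnumerable<string> items, out Task[] tasks)
    {
        var stopwatch = Stopwatch.StartNew();
        tasks = items.Select(start).ToArray();
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        var items = new[] { "dog", "cat", "whale" };

        // Direct invocation: each call blocks the current thread in turn.
        var direct = TimeCreation(NotAsync, items, out var directTasks);
        Task.WaitAll(directTasks);

        // Wrapped in Task.Run: the blocking happens on ThreadPool threads.
        var wrapped = TimeCreation(s => Task.Run(() => NotAsync(s)), items, out var wrappedTasks);
        Task.WaitAll(wrappedTasks);

        Console.WriteLine($"Direct calls: {direct.TotalMilliseconds:F0} ms to create the tasks");
        Console.WriteLine($"Task.Run:     {wrapped.TotalMilliseconds:F0} ms to create the tasks");
    }
}
```

With three items, the direct calls should take roughly three times the sleep duration to create the tasks, while the `Task.Run` version should return almost immediately. Exact numbers depend on the machine and on ThreadPool availability.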

It should be noted that writing asynchronous methods with synchronous implementations violates Microsoft's guidelines:

An asynchronous method that is based on TAP can do a small amount of work synchronously, such as validating arguments and initiating the asynchronous operation, before it returns the resulting task. Synchronous work should be kept to the minimum so the asynchronous method can return quickly.

But sometimes even Microsoft violates this guideline. That's because violating this one is better than violating the guideline about not exposing asynchronous wrappers for synchronous methods. In other words, exposing APIs that call Task.Run internally in order to give the impression of being asynchronous is an even greater sin than blocking the current thread.
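
As a rough illustration of the two guidelines, the sketch below contrasts a library that hides `Task.Run` behind an async-looking method with a caller that offloads a synchronous method explicitly. The `Library`, `Compute`, `ComputeAsync` and `Caller` names are hypothetical, and the `Thread.Sleep` only stands in for CPU-bound work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Library
{
    // Honest synchronous API: it blocks the caller, and its signature says so.
    public static int Compute(int x)
    {
        Thread.Sleep(200); // stands in for CPU-bound work
        return x * 2;
    }

    // Discouraged: an asynchronous wrapper over a synchronous method.
    // It looks asynchronous but silently consumes a ThreadPool thread.
    public static Task<int> ComputeAsync(int x) => Task.Run(() => Compute(x));
}

static class Caller
{
    static async Task Main()
    {
        // Preferred: the caller knows its own context and decides whether to offload.
        int result = await Task.Run(() => Library.Compute(21));
        Console.WriteLine(result); // 42
    }
}
```

Leaving the `Task.Run` to the caller keeps the library's cost visible: a UI application may want to offload the call, while a server application that is already running on a ThreadPool thread usually gains nothing from doing so.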

Theodor Zoulias