2

I'm not sure how I'm supposed to mix PLINQ and async-await. Suppose that I have the following interface:

public interface IDoSomething {
    Task Do();
}

I have a list of these which I would like to execute in parallel and be able to await the completion of all.

public async Task DoAll(IDoSomething[] doers) {
    //Execute all doers in parallel ideally using plinq and 
    //continue when all are complete
}

How to implement this? I'm not sure how to go from parallel linq to Tasks and vice versa.

I'm not terribly worried about exception handling. Ideally the first one would fire and break the whole process as I plan to discard the entire thing on error.

Edit: A lot of people are saying Task.WaitAll. I'm aware of this but my understanding (unless someone can demonstrate otherwise) is that it won't actively parallelize things for you to multiple available processor cores. What I'm specifically asking is twofold -

  1. if I await a Task within a Plinq Action does that get rid of a lot of the advantage since it schedules a new thread?

  2. If I call doers.AsParallel().ForAll(async d => await d.Do()), where each call takes about 5 seconds on average, how do I avoid spinning the invoking thread in the meantime?

i3arnon
George Mauer
  • `await Task.WhenAll(....)` – L.B Oct 26 '14 at 20:44
  • @L.B `Task.WhenAll` will create a continuation that completes when all tasks complete, it has nothing to do with parallelism. – George Mauer Oct 26 '14 at 21:24
  • George Mauer, All tasks run concurrently. So I wouldn't say `it has nothing to do with parallelism.`. If you knew it that well, you wouldn't ask this question... – L.B Oct 26 '14 at 21:27
  • @L.B Are you certain that tasks run concurrently? I've had multiple people involved with the TPL tell me that it's not about parallelism. Of course I might have misunderstood them... My .Net VM doesn't have enough processor cores to test this properly unfortunately so I'm stuck digging through documentation and recalling what people had told me. – George Mauer Oct 26 '14 at 21:29
  • George, I prepared that complex test case for you. `var sw = Stopwatch.StartNew(); await Task.WhenAll(Task.Delay(5000), Task.Delay(5000)); Console.WriteLine(sw.ElapsedMilliseconds);` It will **not** wait for 10 seconds. (unless you have one CPU with a single core) – L.B Oct 26 '14 at 21:34
  • It makes sense to asynchronously wait for parallel code, but it doesn't make as much sense to parallelize asynchronous code. – Stephen Cleary Oct 26 '14 at 21:34
  • @L.B no, that wouldn't wait for 10 seconds. since both tasks start at the same time they're scheduled at the same time. I don't think it would do it on 1 core even since `Task.Delay` is not `Thread.Sleep`. Even if you did `Thread.Sleep` and the scheduler happens to schedule parallel threads that doesn't mean it actively tries to with long-running computations. My understanding is that is what plinq is for. – George Mauer Oct 26 '14 at 21:47
  • @StephenCleary you're absolutely correct and I don't parallelize asynchronous code by default. It is just in my specific case where I have several simultaneous operations that do a lot of file IO (on an SSD) and calculation each in their own state and file space that I am trying to do this. – George Mauer Oct 26 '14 at 21:49
  • When the "parallel asynchronous code" question is asked, the proper answer is usually TPL Dataflow. But a simple `Task.WhenAll` combined with `Task.Run`s may suffice, like @i3arnon's answer. – Stephen Cleary Oct 26 '14 at 21:59
  • @StephenCleary so what is the linq `.AsParallel()` stuff all about then? I just sat through a whole conference talk in August about what that does and it seems to be exactly what I want, I just need it to interact with my TPL code somehow. See my edit. – George Mauer Oct 26 '14 at 22:01
  • `I'm aware of this but my understanding....` Why don't you simply write some CPU-intensive work and run multiple instances of it using Task.WhenAll? It would be easier to see what is going on, instead of trusting strangers on the Internet. – L.B Oct 26 '14 at 22:03
  • @L.B as I said, I would, I just don't have the resources in my .Net dev environment to test this reliably. The production environment has many more cores of course. What is odd is that absolutely no one seems to be saying anything about the `AsParallel()` stuff. – George Mauer Oct 26 '14 at 22:04
  • George, it doesn't have to be the same code. How do you test your code? With real data? Just create any *CPU-intensive work* and test it to see how *WhenAll* works. – L.B Oct 26 '14 at 22:05
  • @L.B I mean that in my dev environment (a VM) I have 2 processor cores available. I'm not sure that plinq or anything else would parallelize efficiently against only 2 cores. I can write a test, but the results wouldn't tell me what I'm trying to find out. – George Mauer Oct 26 '14 at 22:07
  • George, but you have two cores, so it can execute code faster than a single-CPU/single-core machine. Just test your code without any parallelism and then with Task.WhenAll. I really don't get how you got that reputation so far. – L.B Oct 26 '14 at 22:10
  • @L.B Just tried it with a for loop that counts upward. Using plinq as a baseline (because I know that it *does* parallelize stuff) and watching with process explorer I couldn't get it to use both cores. This is what I supposed would happen - one core is kept addressing system things so really only one is available to my application. So no, I can't test this myself and yes, I don't just want some opinion, ideally I'd be pointed at some documentation or just the basics of my question could get answered. – George Mauer Oct 26 '14 at 22:19
  • @GeorgeMauer: PLINQ is for parallelizing CPU-bound work. Async/await is for concurrent asynchronous work (usually I/O-bound). It makes sense to asynchronously wait for parallel work to complete (`await Task.Run(() => { /* parallel stuff */ })`), but it makes much less sense to try to parallelize asynchronous (generally I/O-bound) work. `WhenAll` by itself can give you asynchronous concurrency, or `WhenAll` wrapping `Task.Run`s can give you a kind of asynchronous parallel foreach. But it's **extremely rare** that you would actually *need* that. – Stephen Cleary Oct 26 '14 at 22:34
  • @GeorgeMauer: On a side note, I try not to self-advertise here, but I honestly think you would benefit from [my book](http://tinyurl.com/ConcurrencyCookbook). I address concurrency, async, parallel, TPL Dataflow, Rx, and how they all work together. – Stephen Cleary Oct 26 '14 at 22:37
  • Fair enough @StephenCleary I think that's relevant for this. – George Mauer Oct 26 '14 at 22:44
  • Possible duplicate of [Nesting await in Parallel.ForEach](https://stackoverflow.com/questions/11564506/nesting-await-in-parallel-foreach) – Vitaliy Ulantikov Dec 02 '17 at 20:57

2 Answers

8

What you're looking for is this:

public Task DoAllAsync(IEnumerable<IDoSomething> doers)
{
    return Task.WhenAll(doers.Select(doer => Task.Run(() => doer.Do())));
}

Task.Run hands each call to Do to a ThreadPool thread, so the synchronous parts of the async Do methods run in parallel. Task.WhenAll then asynchronously waits for all of the returned tasks, whose asynchronous parts execute concurrently.
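To make the behaviour above concrete, here is a minimal sketch. The SlowDoer class and its timings are hypothetical and only stand in for a real implementation: Thread.Sleep plays the role of the blocking work before the first await, and Task.Delay plays the role of asynchronous I/O.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public interface IDoSomething
{
    Task Do();
}

// Hypothetical doer: a blocking section followed by an awaited delay.
public sealed class SlowDoer : IDoSomething
{
    public int SyncThreadId { get; private set; }
    public bool Completed { get; private set; }

    public async Task Do()
    {
        // Record which thread runs the synchronous part.
        SyncThreadId = Environment.CurrentManagedThreadId;
        Thread.Sleep(200);      // stands in for CPU-bound work before the first await
        await Task.Delay(200);  // stands in for asynchronous I/O
        Completed = true;
    }
}

public static class TaskRunDemo
{
    // Same shape as the answer's method: each Do() call starts on a ThreadPool thread.
    public static Task DoAllAsync(IEnumerable<IDoSomething> doers)
    {
        return Task.WhenAll(doers.Select(doer => Task.Run(() => doer.Do())));
    }

    public static async Task Main()
    {
        var doers = Enumerable.Range(0, 4).Select(_ => new SlowDoer()).ToArray();
        await DoAllAsync(doers);

        foreach (var doer in doers)
        {
            Console.WriteLine($"completed={doer.Completed}, sync part ran on thread {doer.SyncThreadId}");
        }
    }
}
```

How many distinct thread ids show up depends on the machine and on how many ThreadPool threads are available at that moment, so the output will vary between runs.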


This is a good idea only if you have substantial synchronous parts in these async methods (i.e. the parts before an await) for example:

async Task Do()
{
    for (int i = 0; i < 10000; i++)
    {
        Math.Pow(i,i);
    }

    await Task.Delay(10000);
}

Otherwise, there's no need for parallelism and you can just fire the asynchronous operations concurrently and wait for all the returned tasks using Task.WhenAll:

public Task DoAllAsync(IEnumerable<IDoSomething> doers)
{
    return Task.WhenAll(doers.Select(doer => doer.Do()));
}
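As a rough check that no Task.Run is needed when the methods are purely asynchronous, the sketch below times three doers that only await a delay. DelayDoer and the 300 ms figure are made up for illustration. If the operations really overlap, the elapsed time should be close to one delay and not the sum of all three.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public interface IDoSomething
{
    Task Do();
}

// Hypothetical doer with no meaningful synchronous part: it only awaits a delay.
public sealed class DelayDoer : IDoSomething
{
    public Task Do()
    {
        return Task.Delay(300);
    }
}

public static class ConcurrentDemo
{
    // Same shape as the answer's second method: no Task.Run, just start and await.
    public static Task DoAllAsync(IEnumerable<IDoSomething> doers)
    {
        return Task.WhenAll(doers.Select(doer => doer.Do()));
    }

    // Runs three 300 ms doers and returns the elapsed wall-clock milliseconds.
    public static async Task<long> MeasureAsync()
    {
        var doers = new IDoSomething[] { new DelayDoer(), new DelayDoer(), new DelayDoer() };
        var stopwatch = Stopwatch.StartNew();
        await DoAllAsync(doers);
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        long elapsed = await MeasureAsync();
        Console.WriteLine($"Elapsed: {elapsed} ms (sequential execution would need about 900 ms)");
    }
}
```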
i3arnon
  • So what is the difference between something like this and using `AsParallel()` then? You've got to understand why I'm suspicious about answers that seem to ignore half of the question. – George Mauer Oct 26 '14 at 22:09
  • @GeorgeMauer plinq is older than `async-await` and so not so capable of handling async methods. You could use `AsParallel`, but what then? The plinq answer would be to use `ForAll`, but that can't accept an async delegate and doesn't return an awaitable task. The `async` way of doing what you want is `Task.Run` and `Task.WhenAll`, no part of the question is being ignored. – i3arnon Oct 26 '14 at 22:16
  • @GeorgeMauer I see you've edited in a `ForAll` option to your question. That would create an `async void` lambda expression which is extremely dangerous: [Async/Await Best Practices in Asynchronous Programming](http://msdn.microsoft.com/en-us/magazine/jj991977.aspx) – i3arnon Oct 26 '14 at 22:23
  • So wait? Parallel Linq is deprecated? And yeah, the fact that `ForAll` is void is one of the things that's confusing me. Thanks for the additional link – George Mauer Oct 26 '14 at 22:23
  • @GeorgeMauer of course not. It just has a different purpose than the Task-based Asynchronous Pattern – i3arnon Oct 26 '14 at 22:25
  • So - and I swear I'm not trolling - I really am trying to understand how they interrelate then. Are you saying that TPL is a superset of the functionality? Is there something plinq does that TPL does not? If so figuring out how to make them work together is exactly what I'm asking. – George Mauer Oct 26 '14 at 22:27
  • @GeorgeMauer TPL is the underlying framework for both `async-await` and PLinq (and TPL Dataflow and others). PLinq is a solution for parallel processing. `async-await` is a better solution for asynchronous programming than the older options (e.g. `BeginX` & `EndX`). In most cases they have nothing in common. In yours they may have, but they don't fit together well. You could take a look at TPL Dataflow and Reactive Extensions that were built with `async` in mind and can fill the hole PLinq left. – i3arnon Oct 26 '14 at 22:32
  • Ok, thanks, that starts completing the picture for me. – George Mauer Oct 26 '14 at 22:33
  • what about `items.AsParallel().ForAll(async i => { await dbcall1.doStuff(); await dbcall2.dostuff(); })`? is there a benefit to using both together when actually doing IO tasks? – ps2goat Sep 28 '16 at 21:53
  • @ps2goat plinq (and linq) don't support async. That means ForAll expects delegates that return void and not a task. So ForAll doesn't actually wait for your lambdas to complete. – i3arnon Sep 28 '16 at 22:51
  • @ps2goat also, since these are async void lambdas an exception inside them will crash the process. – i3arnon Sep 28 '16 at 22:53
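The point made in these comments, that ForAll does not wait for async lambdas, can be observed with a small sketch. The 500 ms delay and all names here are hypothetical. The lambda compiles to an async void delegate, so ForAll returns as soon as each lambda reaches its first await, and the counter is read before any of them has had time to finish.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForAllDemo
{
    // Returns how many lambdas had finished at the moment ForAll returned.
    public static int CompletedWhenForAllReturns()
    {
        int completed = 0;

        // ForAll takes an Action<int>, so this lambda becomes async void:
        // nothing is returned that ForAll (or the caller) could await.
        Enumerable.Range(0, 4).AsParallel().ForAll(async i =>
        {
            await Task.Delay(500);
            Interlocked.Increment(ref completed);
        });

        return Volatile.Read(ref completed);
    }

    public static void Main()
    {
        Console.WriteLine($"Completed when ForAll returned: {CompletedWhenForAllReturns()}");

        // Keep the process alive long enough for the orphaned lambdas to finish.
        Thread.Sleep(1000);
    }
}
```

The lambdas deliberately do not throw: as noted above, an exception escaping an async void lambda would take down the process.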
2
public async Task DoAll(IDoSomething[] doers) {
    //using ToArray to materialize the query right here
    //so we don't accidentally run it twice later.
    var tasks = doers.Select(d => Task.Run(()=>d.Do())).ToArray();
    await Task.WhenAll(tasks);
}
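The ToArray comment in the code above is worth a quick illustration. A Select query is lazy, so every enumeration invokes the lambda again and starts the work again. The sketch below uses a hypothetical CountingDoer and leaves out Task.Run to keep the counting simple; the laziness issue is the same either way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public interface IDoSomething
{
    Task Do();
}

// Hypothetical doer that only counts how often Do() is invoked.
public sealed class CountingDoer : IDoSomething
{
    public static int Calls;

    public Task Do()
    {
        Interlocked.Increment(ref Calls);
        return Task.CompletedTask;
    }
}

public static class MaterializeDemo
{
    // Returns how many times Do() ran when the task sequence is enumerated twice.
    public static async Task<int> CountCallsAsync(bool materialize)
    {
        CountingDoer.Calls = 0;
        var doers = new IDoSomething[] { new CountingDoer(), new CountingDoer(), new CountingDoer() };

        IEnumerable<Task> tasks = doers.Select(d => d.Do());
        if (materialize)
        {
            tasks = tasks.ToArray(); // runs the query once and keeps the resulting tasks
        }

        await Task.WhenAll(tasks);   // first enumeration
        foreach (var task in tasks)  // second enumeration
        {
            await task;
        }

        return CountingDoer.Calls;
    }

    public static async Task Main()
    {
        int lazy = await CountCallsAsync(materialize: false);
        int materialized = await CountCallsAsync(materialize: true);
        Console.WriteLine($"Lazy: {lazy}");                 // prints "Lazy: 6"
        Console.WriteLine($"Materialized: {materialized}"); // prints "Materialized: 3"
    }
}
```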
spender
  • 90% sure that this will wire everything up asynchronously but the `doers` will not by default run in parallel. It's the whole asynchrony-isnt-parallelism thing. Unless something is specifically telling the TPL to use parallelism for this. Right? – George Mauer Oct 26 '14 at 21:17
  • What I mean to say is that unless the TPL specifically has a scheduler that uses parallel threads, this will just run one at a time (though asynchronously), similar to javascript. [I'm pretty sure that's what point 3 in the async/await FAQ implies](http://blogs.msdn.com/b/pfxteam/archive/2012/04/12/async-await-faq.aspx) – George Mauer Oct 26 '14 at 21:24
  • Ok, so use Task.Run to force it into the ThreadPool (or wherever the default scheduler sends it to run...) – spender Oct 26 '14 at 21:58