How do I add a finalizer that runs once all parallels have completed?

Parallel.ForEach(entries, new ParallelOptions { MaxDegreeOfParallelism = 15 }, async (entry) =>
{
    // Do something with the entry.
});

I have tried like this but it doesn't compile:

Parallel.ForEach(entries, new ParallelOptions { MaxDegreeOfParallelism = 15 }, async (entry) =>
{
    // Do something with the entry.
}, () => { /* Was hoping this would work. */ });
Alexandru
  • Parallel.ForEach does ***NOT*** work correctly when you do `async (entry) =>`. Do not use async methods with it. You must use non-async methods or switch to something that supports them, like [TPL Dataflow](https://msdn.microsoft.com/en-us/library/hh228603(v=vs.110).aspx). – Scott Chamberlain Jun 22 '16 at 16:29
  • `Parallel.ForEach` doesn't have an overload for a function returning `Task`, so your async function is just going to return immediately and the iteration will complete near-immediately. See [this question](http://stackoverflow.com/questions/11564506/nesting-await-in-parallell-foreach). – Charles Mager Jun 22 '16 at 16:29

3 Answers

  1. You should not declare the action for the Parallel.ForEach as async. If you use an await inside that action, control returns to Parallel.ForEach at the first await that actually has to wait, and its implementation "thinks" that the action is finished. This leads to very different behaviour from what you expect.

  2. The call to Parallel.ForEach returns when the loop is completed. It returns when all actions have been done for all the elements in the enumeration. So whatever you want to do "when all parallels have completed" can be done right after that call:

    Parallel.ForEach(entries, new ParallelOptions { MaxDegreeOfParallelism = 15 },
             (entry) =>
             {
                 // Do something with the entry.
             });
    DoSomethingWhenAllParallelsHaveCompleted();
    
René Vogt
  • How would I await inside of the action then? I need to use an `HttpClient` to send data, and it has a `SendAsync` method on it. – Alexandru Jun 22 '16 at 16:33
  • @Alexandru You would need to use `.Wait()` or `.Result` instead. Just do it synchronously. – René Vogt Jun 22 '16 at 16:35
  • Can you provide an example? `.Result` can deadlock, and I am not sure if `.Wait()` guarantees a result afterwards? This `async` shit is a mess. – Alexandru Jun 22 '16 at 16:36
  • @Alexandru actually `async` is really cool, it's just hard to combine with "normal" threads and `Parallel`. An example is hard to give as it depends on what you are really doing. I can't even see why `.Result` should lead to a deadlock where your `async` should have worked. – René Vogt Jun 22 '16 at 16:39
  • http://stackoverflow.com/questions/17248680/await-works-but-calling-task-result-hangs-deadlocks – Alexandru Jun 22 '16 at 16:44
  • @Alexandru Use [TPL Dataflow](https://msdn.microsoft.com/en-us/library/hh228603(v=vs.110).aspx); the equivalent of your Parallel.ForEach is a `new ActionBlock(async (entry) => ..., new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 15 })` – Scott Chamberlain Jun 22 '16 at 16:48
  • @Alexandru as I said, it depends on what you are doing. Parallel or asynchronous execution can boil your brain...as a 3k user you know that this is a Q&A site, so maybe you start a new question about your specific problems with that action, as this is getting too broad to discuss in comments. – René Vogt Jun 22 '16 at 16:49
  • @ScottChamberlain Thanks Scott, I will look into it. – Alexandru Jun 22 '16 at 17:41

You don't have to do anything. Your Parallel.ForEach will run until all threads have finished their work. That's one of the really nice benefits of Parallel.ForEach().

So right after Parallel.ForEach(() => { /* code */ }); all threads will be finished.

C4d

As I and others mentioned in the comments, Parallel.ForEach does not support async functions. The reason is that writing async (entry) => ... is the same as

Parallel.ForEach(entries, Example);

//elsewhere
async void Example(Entry entry)
{
   ...
}

Because the function is async void, the ForEach can't tell when the function is "done", so it treats it as done when it hits the first await instead of when the task finishes.
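To make the early return visible, here is a minimal sketch (the names `AsyncVoidDemo` and `CompletedWhenLoopReturned` are made up for illustration and are not from the question). It counts how many bodies have actually finished at the moment `Parallel.ForEach` returns. Because every body awaits a 500 ms delay, the count is normally still zero when the loop reports completion.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    // Hypothetical helper: returns how many loop bodies had run to the end
    // at the moment Parallel.ForEach itself returned.
    public static int CompletedWhenLoopReturned()
    {
        int completed = 0;

        // The async lambda is converted to Action<int>, i.e. it is "async void".
        Parallel.ForEach(Enumerable.Range(0, 10), async i =>
        {
            await Task.Delay(500);                // control goes back to Parallel.ForEach here
            Interlocked.Increment(ref completed); // runs later, after the loop has returned
        });

        return Volatile.Read(ref completed);
    }
}
```

Calling `AsyncVoidDemo.CompletedWhenLoopReturned()` from a console app should normally give 0, even though the loop has "completed", which is exactly why a finalizer placed after the loop would run too early.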

The way to fix this is to use a library that can support async functions; TPL Dataflow is a good one. You get it by installing the NuGet package Microsoft.Tpl.Dataflow into your project. You could recreate your previous code as

private const int MAX_PARALLELISM = 15;

public async Task ProcessEntries(IEnumerable<Entry> entries)
{
    var block = new ActionBlock<Entry>(async (entry) => 
                                       {
                                           // This is now an "async Task" instead of an async void
                                       }, 
                                       new ExecutionDataflowBlockOptions 
                                       { 
                                           MaxDegreeOfParallelism = MAX_PARALLELISM

                                       });
    foreach(var entry in entries)
    {
        await block.SendAsync(entry);
    }

    block.Complete();
    await block.Completion;

    DoExtraWorkWhenDone();
}
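
If adding a package is not an option, a similar effect can be sketched with only the base class library. This is a different technique from the ActionBlock above, not a drop-in equivalent: it throttles with a `SemaphoreSlim` and waits with `Task.WhenAll`. The names `ThrottledRunner`, `ForEachThrottledAsync`, and `RunOneAsync` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Runs body for every item with at most maxParallelism bodies in flight,
    // and completes only after every body has finished.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxParallelism, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxParallelism))
        {
            var tasks = new List<Task>();
            foreach (var item in items)
            {
                await gate.WaitAsync();                   // wait for a free slot
                tasks.Add(RunOneAsync(item, body, gate)); // start without awaiting
            }
            await Task.WhenAll(tasks);                    // rethrows if any body failed
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> body, SemaphoreSlim gate)
    {
        try { await body(item); }
        finally { gate.Release(); }                       // free the slot even on failure
    }
}
```

Usage would look like `await ThrottledRunner.ForEachThrottledAsync(entries, 15, entry => SendAsync(entry));` followed by the finalizer call, where `SendAsync` stands in for whatever async work each entry needs. On .NET 6 and later, the built-in `Parallel.ForEachAsync` covers the same need.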
Scott Chamberlain
  • Thank you so much for this bit of advice, it will go a long, long way. This is exactly what I was looking for. Even beyond that, this is what `Task` should have had all along. – Alexandru Jun 22 '16 at 18:39
  • I wish I could up-vote you more. I was having one of the worst days I've ever had at work, with hyenas breathing down my neck every few seconds for status updates, meanwhile I was trying to concentrate on writing some quality code. So, ultimately my thanks to you Scott, you saved my life today. Sometimes people don't realize the extent of the impacts they can have on others, but this was huge. Thanks again, man. Just wanted to share that with you :) – Alexandru Jun 22 '16 at 20:27
  • A follow up to this is how people can make use of and leverage `Func` within TPL blocks: http://www.dima.to/blog/?p=322 and this is also a good read: http://blog.stephencleary.com/2014/02/synchronous-and-asynchronous-delegate – Alexandru Jul 15 '16 at 02:34