
I have potentially several thousand independent tasks that need to be run. Each of them may make database calls, so they're already leveraging async where possible. That said, if I wanted all of them to run in parallel what's the best way to do this?

I've got this currently:

Parallel.For(0, items.Count, async _ => await PerformTask());

I've also considered using:

List<Task> tasks = new List<Task>();
for(var i = 0; i < items.Count; ++i) tasks.Add(PerformTask());
await Task.WhenAll(tasks); // or possibly Task.WaitAll(tasks.ToArray())

Is there an objectively best way to do this?

Edit: This is different from the marked duplicate question, since I'm not asking about the difference between the two approaches. I'm asking which one is correct for my use case.

Charles
  • Possible duplicate of [Parallel.ForEach vs Task.Run and Task.WhenAll](http://stackoverflow.com/questions/19102966/parallel-foreach-vs-task-run-and-task-whenall) – Liam Aug 23 '16 at 15:16
  • for the last part, `Task.WhenAll` is probably preferable, as it returns a task. `Task.WaitAll` returns void, and would block till all complete. – Jonesopolis Aug 23 '16 at 15:16
  • @Liam, not a duplicate, see edit. The answers on the linked question don't actually state what should be used under specific circumstances. And it doesn't seem like there's a set "only do it this way" answer. – Charles Aug 23 '16 at 17:19
  • @Charles, if you need more granular control over each task then you should use Task, otherwise go with Parallel.For. – Vinod Aug 23 '16 at 19:21
  • @Vinod, that answers the question, but could you go into more detail? – Charles Aug 24 '16 at 12:39

1 Answer


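Parallel is not an option because you have asynchronous actions. Parallel.For only accepts synchronous delegates such as Action<int>, so your async lambda is compiled as an async void method: the loop considers each iteration finished at its first await and returns before the work has actually completed.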
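To make that concrete, here is a minimal sketch of the failure mode. The `PerformTask` below is a hypothetical stand-in for yours (a 500 ms `Task.Delay` simulates the database call), and the class and method names are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelAsyncPitfall
{
    // Hypothetical stand-in for PerformTask(); the delay simulates a database call.
    private static async Task PerformTask(Action onDone)
    {
        await Task.Delay(500);
        onDone();
    }

    // Returns how many tasks had finished at the moment Parallel.For returned.
    public static int CompletedWhenLoopReturns(int count)
    {
        int completed = 0;

        // The async lambda is converted to Action<int>, i.e. async void, so
        // Parallel.For has no task to wait on and moves on at the first await.
        Parallel.For(0, count, async _ =>
            await PerformTask(() => Interlocked.Increment(ref completed)));

        return Volatile.Read(ref completed);
    }
}
```

Calling `ParallelAsyncPitfall.CompletedWhenLoopReturns(20)` should normally report that few or none of the 20 tasks had finished, because the loop returns long before the simulated 500 ms calls complete.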

Your options are:

  • Start all of the tasks simultaneously, and then use await Task.WhenAll to wait for them all to complete. You can use SemaphoreSlim if you find you need to throttle the number of active tasks.
  • Use an ActionBlock<T> (from TPL Dataflow) to queue up the work individually. You can use ExecutionDataflowBlockOptions.MaxDegreeOfParallelism if you want to process more than one simultaneously.
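The first option, with the optional throttle, might look roughly like the sketch below. `ThrottledWhenAll`, `RunAsync`, `work` and `maxConcurrency` are names chosen for illustration; `work` stands in for your `PerformTask`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledWhenAll
{
    // Runs `work` once per item, with at most `maxConcurrency` invocations in flight,
    // and completes when every item has been processed.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, Func<T, Task> work, int maxConcurrency)
    {
        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() forces the enumeration, so every task is created up front;
            // each one then waits asynchronously for a free slot before doing work.
            List<Task> tasks = items.Select(async item =>
            {
                await throttle.WaitAsync();
                try
                {
                    await work(item);
                }
                finally
                {
                    throttle.Release(); // free the slot even if work throws
                }
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }
}
```

Usage would be something like `await ThrottledWhenAll.RunAsync(items, item => PerformTask(), 50);`, where 50 is an arbitrary limit. If you don't need throttling, the plain `tasks.Add(PerformTask())` loop followed by `await Task.WhenAll(tasks)` from the question is already enough.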

The ActionBlock<T> approach would be better if you don't know about all the tasks at the time they're started (i.e., if more can arrive while you're processing), or if other nearby parts of your code will fit into a "pipeline" kind of design.
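For comparison, a minimal sketch of the ActionBlock<T> option. The wrapper class, method and parameter names are hypothetical; depending on your target framework, the Dataflow types may require the System.Threading.Tasks.Dataflow NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ActionBlockRunner
{
    // Queues every item into an ActionBlock and waits for the block to drain.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, Func<T, Task> work, int maxDegreeOfParallelism)
    {
        var block = new ActionBlock<T>(work, new ExecutionDataflowBlockOptions
        {
            // The default is 1, i.e. items are processed one at a time.
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        });

        foreach (T item in items)
        {
            // The block's capacity is unbounded by default, so Post accepts each item.
            block.Post(item);
        }

        block.Complete();       // signal that no more items will arrive
        await block.Completion; // completes once every queued item has been processed
    }
}
```

In a real pipeline you would usually keep the block alive and keep posting to it as work arrives, calling `Complete()` only at shutdown; the sketch closes it immediately because all items are known up front.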

Task.WhenAll is nice because it doesn't require a separate library with its own design philosophy and learning curve.

Either Task.WhenAll or ActionBlock<T> would work well for your use case.

Stephen Cleary
  • Gotcha, in my case I do know every task that will be run, so `ActionBlock` loses its edge, I suppose. Overall, the processing time of each individual task is extremely short, up to the DB call. I'm comfortable with the query pool distributing that for me, so I'll stick with `Task.WhenAll`. Thanks! – Charles Aug 25 '16 at 12:12