I am trying to wrap my head around how to handle multiple async/await calls in a foreach loop. I have around 20,000 rows of data that are processed by the foreach loop. Roughly my code is:

foreach (var item in data)
{
    if (ConditionA(item))
    {
        if (ConditionAB(item))
        {
            await CreateThingViaAPICall(item);
        }
        else
        {
            var result = await GetExistingRecord(item);
            var result2 = await GetOtherExistingRecord(result);
            var result3 = await GetOtherExistingRecord(result2);
            //Do processing
            ...           
            await CreateThingViaAPICall();
        }
    }
    ... and so on        
}

I've seen many posts saying the best way to use async in a loop is to build a list of tasks and then use Task.WhenAll. In my case I have Tasks that depend on each other as part of each iteration. How do I build up a list of tasks to execute in this case?

Jacob
  • What's the problem with what you have? – Jon Hanna Feb 15 '16 at 00:29
  • If I'm not mistaken, the recommended way to use async/await in a foreach is to build a list of Tasks first and then call Task.WhenAll. My issue is that I have multiple tasks per iteration of the loop that depend on each other. How can I build a list of tasks to await in this case? – Jacob Feb 15 '16 at 00:41
  • Did I misunderstand your question? Does each item in the iteration depend on the successful completion of the previous item? – Todd Menier Feb 16 '16 at 17:03
  • Each item in the iteration does depend on the previous one. Your answer was actually really helpful; I just couldn't choose both as the answer. – Jacob Feb 17 '16 at 22:19
  • Ah, ok, then I did make an incorrect assumption. Wasn't my intent to swipe the accepted answer but I'm glad it was helpful anyway :) – Todd Menier Feb 18 '16 at 14:50

2 Answers

It's easiest if you break the processing of an individual item into a separate (async) method:

private async Task ProcessItemAsync(Item item)
{
    if (ConditionA(item))
    {
        if (ConditionAB(item))
        {
            await CreateThingViaAPICall(item);
        }
        else
        {
            var result = await GetExistingRecord(item);
            var result2 = await GetOtherExistingRecord(result);
            var result3 = await GetOtherExistingRecord(result2);
            //Do processing
            ...           
            await CreateThingViaAPICall();
        }
    }
    ... and so on
}

Then process your collection like so:

var tasks = data.Select(ProcessItemAsync);
await Task.WhenAll(tasks);

This effectively wraps the multiple dependent Tasks required to process a single item into one Task, allowing those steps to happen sequentially while items of the collection itself are processed concurrently.
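To make that concrete, here is a minimal, self-contained sketch of the same shape. The two helper methods are hypothetical stand-ins for the API calls in the question (they only simulate latency with Task.Delay), and the items are plain ints purely for illustration.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class PerItemPipelineSketch
{
    // Hypothetical stand-in for GetExistingRecord: simulates a slow API call.
    static async Task<int> GetExistingRecordAsync(int item)
    {
        await Task.Delay(10);
        return item * 10;
    }

    // Hypothetical stand-in for GetOtherExistingRecord.
    static async Task<int> GetOtherExistingRecordAsync(int record)
    {
        await Task.Delay(10);
        return record + 1;
    }

    // Steps for ONE item run in order: each await completes before the next call starts.
    static async Task<int> ProcessItemAsync(int item)
    {
        var result = await GetExistingRecordAsync(item);
        var result2 = await GetOtherExistingRecordAsync(result);
        return result2;
    }

    // Items run concurrently: ToList starts one task per item, WhenAll awaits them all.
    // The returned array is in the same order as the input.
    public static async Task<int[]> ProcessAllAsync(int[] data)
    {
        var tasks = data.Select(ProcessItemAsync).ToList();
        return await Task.WhenAll(tasks);
    }
}
```

Calling `await PerItemPipelineSketch.ProcessAllAsync(new[] { 1, 2, 3 })` yields 11, 21, 31. The total time is roughly that of one item, not three, because the items overlap while each item's own two steps stay in order.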

With tens of thousands of items, you may find, for a variety of reasons, that you need to throttle the number of Tasks running concurrently. Have a look at TPL Dataflow for this type of scenario. See here for an example.
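If pulling in TPL Dataflow is more than you need, a simpler way to cap concurrency is a SemaphoreSlim. To be clear, this is a different technique from Dataflow, shown only as a rough sketch; the class name, method name and parameters are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs processItem for every element, with at most maxConcurrency in flight at once.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> processItem)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();          // wait for a free slot
            try
            {
                await processItem(item);     // the item's own steps still run in order
            }
            finally
            {
                gate.Release();              // free the slot even if processing throws
            }
        }).ToList();                         // ToList starts every task exactly once

        await Task.WhenAll(tasks);
    }
}
```

Usage would look something like `await ThrottleSketch.ForEachThrottledAsync(data, 10, ProcessItemAsync);`. Note that this still creates one Task object per item up front; most of them just sit waiting on the semaphore, which is usually acceptable for 20,000 items.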

Todd Menier
  • Ok, that makes sense, and combined with @jon-hanna's answer it makes even more sense. – Jacob Feb 15 '16 at 03:15
  • The way I understood your question is that item A and item B can be processed concurrently, but the steps involved in processing an individual item must happen sequentially. That's what this approach gives you. – Todd Menier Feb 15 '16 at 03:26

"If I'm not mistaken the recommended way to use async/await in a foreach is to build a list of Tasks first then call Task.WhenAll."

You're partly mistaken.

If you have multiple tasks that don't depend on each other, then it is indeed generally a very good idea to run them through a WhenAll so that they can be scheduled together, giving better throughput.

If, however, each task depends on the result of the previous one, then this approach isn't viable. Instead you should just await them within a foreach.

Indeed, awaiting within a foreach will work fine in any case; it's just suboptimal to have tasks wait on each other if they don't have to.
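As a rough illustration of the dependent case, here is a sketch where each item needs the result produced by the previous one. ProcessWithPreviousAsync is a hypothetical stand-in for a real API call, and the running-total logic is invented just to make the dependency visible.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialSketch
{
    // Hypothetical API call: needs the previous item's result to handle this item.
    static async Task<int> ProcessWithPreviousAsync(int item, int previous)
    {
        await Task.Delay(10);   // simulated latency
        return previous + item;
    }

    public static async Task<List<int>> ProcessInOrderAsync(IEnumerable<int> data)
    {
        var results = new List<int>();
        var previous = 0;
        foreach (var item in data)
        {
            // The next iteration cannot start until this await completes,
            // so there is nothing to hand to Task.WhenAll here.
            previous = await ProcessWithPreviousAsync(item, previous);
            results.Add(previous);
        }
        return results;
    }
}
```

For the input 1, 2, 3 this produces 1, 3, 6. The calling thread is still not blocked while each call is in flight; only the items themselves are serialized.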

The ability to await tasks in a foreach is in fact one of the biggest gains that async/await has given us. Most code that uses await can be rewritten to use ContinueWith quite easily, if less elegantly, but loops were trickier, and trickier still when the end of the loop could only be determined by examining the results of the tasks themselves.
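A small sketch of that comparison, with a hypothetical StepAsync standing in for any async call: two dependent steps written with ContinueWith, the same two steps with await, and a loop whose exit condition depends on each result.

```csharp
using System.Threading.Tasks;

public static class AwaitVersusContinueWithSketch
{
    // Hypothetical async step: doubles its input after a short simulated delay.
    static async Task<int> StepAsync(int value)
    {
        await Task.Delay(5);
        return value * 2;
    }

    // Two dependent steps with ContinueWith: workable, but the nested task needs Unwrap.
    public static Task<int> TwoStepsContinueWith(int start) =>
        StepAsync(start).ContinueWith(t => StepAsync(t.Result)).Unwrap();

    // The same two steps with await.
    public static async Task<int> TwoStepsAwait(int start)
    {
        var first = await StepAsync(start);
        return await StepAsync(first);
    }

    // A loop that stops based on the results themselves. With await this is an
    // ordinary while loop; with ContinueWith it would need a recursive continuation.
    public static async Task<int> DoubleUntilAsync(int start, int limit)
    {
        var value = start;
        while (value < limit)
        {
            value = await StepAsync(value);
        }
        return value;
    }
}
```

Both two-step versions return 12 for an input of 3, and `DoubleUntilAsync(1, 10)` returns 16 after four awaited steps.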

Jon Hanna
  • So the recommendation to use Task.WhenAll is a performance related recommendation, not a 'bad things will happen if you do this' recommendation? – Jacob Feb 15 '16 at 03:11
  • Yes. Now, when it's applicable it can make a *big* difference, so certainly do take that approach when you can. You might even be able to do a hybrid: chunks of tasks where the tasks within a chunk depend on each other, but where each chunk is independent of the others. – Jon Hanna Feb 15 '16 at 03:34