0

Consider this piece of code, where there is some work being done within a for loop, and then a recursive call to process sub items. I wanted to convert DoSomething(item) and GetItems(id) to async methods, but if I await on them here, the for loop is going to wait for each iteration to finish before moving on, essentially losing the benefit of parallel processing. How could I improve the performance of this method? Is it possible to do it using async/await?

public void DoWork(string id)
        {            
                var items = GetItems(id);  //takes time

                if (items == null)
                    return;

                Parallel.ForEach(items, item =>
                {
                    DoSomething(item); //takes time
                    DoWork(item.subItemId);                    
                });           
        }
Prabhu
  • 12,995
  • 33
  • 127
  • 210
  • possible duplicate of [What is the correct way to use async/await in a recursive method?](http://stackoverflow.com/questions/25007709/what-is-the-correct-way-to-use-async-await-in-a-recursive-method) – ZombieSheep Jul 29 '14 at 06:27
  • @ZombieSheep the other question was asked by me too. There is a difference between the two. The other is to do with recursion and doesn't discuss about performance. This one is in a for loop, and I'm wondering how to get the loop going without awaiting, while performing an async operation inside the loop. – Prabhu Jul 29 '14 at 06:30
  • I would suggest you would get better quality answers by either figuring out what part of the process is the issue, or combining the two questions. That way any suggestions given would take account of the rest of the information. – ZombieSheep Jul 29 '14 at 06:32
  • They are two completely different questions though--I didn't want to combine them and muddle the discussion. The other was one about whether async/await was safe in a recursive method. This is about how to use async/await effectively in a loop. The other one was not about performance at all. – Prabhu Jul 29 '14 at 06:38
  • Is the work you do in `DoSomething`/`DoWork` indeed async? If not you gain nothing at all by making them async. If yes just start each in a thread and wait for all of them if you have to go parallel - still I would stick with the `Parallel.ForEach` – Random Dev Jul 29 '14 at 07:16
  • @CarstenKönig I could convert DoSomething into an async method. But if I was going to place DoSomething in a thread as you suggested, is it really needed that DoSomething should be async? – Prabhu Jul 29 '14 at 07:19
  • no it is not - but if there are no async. task involved (like file IO or something) than you gain nothing by going async (indeed you might even lose performance) – Random Dev Jul 29 '14 at 07:25
  • there are many ways to solve this (a Barrier comes to mind) but you need to know the problem to give low level advice like this – Random Dev Jul 29 '14 at 07:29
  • Possible duplicate of [Async and Await with For Loop](http://stackoverflow.com/questions/26069487/async-and-await-with-for-loop) – Michael Freidgeim Jun 14 '16 at 13:28

1 Answers1

3

Instead of using Parallel.ForEach to loop over the items you can create a sequence of tasks and then use Task.WhenAll to wait for them all to complete. As your code also involves recursion it gets slightly more complicated and you need to combine DoSomething and DoWork into a single method which I have aptly named DoIt:

async Task DoWork(String id) {
  var items = GetItems(id);
  if (items == null)
    return;
  var tasks = items.Select(DoIt);
  await Task.WhenAll(tasks);
}

async Task DoIt(Item item) {
  await DoSomething(item);
  await DoWork(item.subItemId);
}

Mixing Parallel.ForEach and async/await is a bad idea. Parallel.ForEach will allow your code to execute in parallel and for compute intensive but parallelizable algorithms you get the best performance. However async/await allows your code to execute concurrently and for instance reuse threads that are blocked on IO operations.

Simplified Parallel.ForEach will setup as many threads as you have CPU cores on your computer and then partition the items you are iterating to be executed across these threads. So Parallel.ForEach should be used once at the bottom of your call stack where it will then fan out the work to multiple threads and wait for them to complete. Calling Parallel.ForEach in a recursive manner inside each of these threads is just crazy and will not improve performance at all.

Martin Liversage
  • 104,481
  • 22
  • 209
  • 256
  • with this method, will it be a problem if a recursive call is part of the tasks that are being run in a parallel? In my question, you can see DoWork(), which is a recursive call, is being called right after DoSomething(). Thanks. – Prabhu Jul 29 '14 at 14:49
  • @Prabhu: I did not notice the recursive call. Then you should definitely avoid using `Parallel.ForEach` but I do not see any problem in using tasks as I have described. Then you get a hierarchy of tasks from the recursion and as long as you can accommodate the state of all the tasks in your process you should be fine. – Martin Liversage Jul 29 '14 at 15:05
  • Thanks @Martin. I am having trouble constructing my lambda expression as you've suggested with the multiple lines DoSomethign and DoWork. Could you please give me a pointer? Also, why should Parallel.ForEach be avoided? – Prabhu Jul 29 '14 at 15:12