9

So I spent the better part of the night trying to figure this out.

I was fortunate to get introduced to Parallel.ForEach yesterday and it works like I want it to do except from one detail.

I have the following:

Parallel.ForEach(data, (d) =>
{
    try
    {
        MyMethod(d, measurements);
    }
    catch (Exception e)
    {
        // log
    }
});

Within the method "MyMethod" I have a lot of logic that gets done and most of it is fine but I make API calls where I fetch data and I use an async task for this to be able to use await in order for the code to wait until that specific part gets executed and then move on:

private async void MyMethod(PimData pimData, IEnumerable<ProductMeasurements> measurements)
{
    try
    {
        // a lot of logic but most relevant part 

        await Task.WhenAll(ExecuteMeasurmentAndChartLogic(pimData.ProductNumber, entity));

        await Task.WhenAll(resourceImportManager.HandleEntityImageFiles(pimData.ProductType + pimData.ProductSize,SwepImageType.Png, ResourceFileTypes.ThreeD, entity, LinkTypeId.ProductResource));

        await Task.WhenAll(resourceImportManager.HandleEntityImageFiles(pimData.ProductSketch, SwepImageType.Png, ResourceFileTypes.Sketch, entity, LinkTypeId.ProductResource));

    }
    catch (Exception e)
    {
        // log
    }
}

Problems:

1 For starters the loop finishes before all code is finished

2 Second problem is that I get "Task was cancelled" in a lot of API calls

3 And third as mentioned above, the code does not wait for each method to fully execute.

I cant get it to execute everything in ExecuteMeasurmentAndChartLogic() method before moving forward to the next step.

This gives me the following issues (more issues):

In this method I create an item and add it to the db, and this item needs more info that I get from an API call that is done inside of ExecuteMeasurmentAndChartLogic() but the problem is that several items get created and have to wait for the rest of the data which is not what I desire.

SIDE-NOTE: I am aware that creating an item and adding to the db before all data is there is not best practice but I am integrating toward PIM and the process for that is delicate

I want several threads running but at the same time I want the full logic to execute for each item before moving on to the next method.

Clarification:

Several items running

Each item handles ALL the logic it needs to handle before moving on to the next part of the code. Normally do this with await.

In the code above resourceImportManager() method gets executed before ExecuteMeasurmentAndChartLogic() is finished, which is what I don't want.

Instead of a Parallel.ForEach I used :

Task task1 = Task.Factory.StartNew(() => MyMethod(data, measurements));
Task.WaitAll(task1);

But that wasn't much of a help.

Fairly new to this and haven't been able to understand where I am doing it wrong.

EDIT: Updated the problems with this

EDIT: this is how ExecuteMeasurmentAndChartLogic() looks:

public async Task ExecuteMeasurmentAndChartLogic(string productNumber, Entity entity)
{
    try
    {
        GrafGeneratorManager grafManager = new GrafGeneratorManager();
        var graphMeasurmentList = await MeasurmentHandler.GetMeasurments(productNumber);

        if (graphMeasurmentList.Count == 0) return;

        var chart = await grafManager.GenerateChart(500, 950, SystemColors.Window, ChartColorPalette.EarthTones,
                    "legend", graphMeasurmentList);

        await AddChartsAndAddToXpc(chart, entity, productNumber);
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
}

EDIT: Background to this: I make a call to an API to get a lot of data. For each item in this data I need to make an API call and get data that I apply to the item.

After reading comments which also got me thinking in a different way. I can perhaps loop through all my items and do minor logic for them and add a URL in a task list and make a separate task that executes this one by one.

Will keep this updated

Josh Gallagher
  • 5,211
  • 2
  • 33
  • 60
ThunD3eR
  • 3,216
  • 5
  • 50
  • 94
  • Use callback delegates once the processing complete. – Ramankingdom Jul 28 '17 at 08:48
  • Any particular reason why `MyMethod` is an `async void` and not an `async Task`? – GWigWam Jul 28 '17 at 08:59
  • @GWigWam no particular reason, was meaning to change it, thanks for reminding me – ThunD3eR Jul 28 '17 at 09:00
  • 3
    What does `resourceImportManager.HandleEntityImageFiles` return. For a starter it looks like you are using WaitAll and WhenAll for single tasks. Also, async void is a big no no, you could probably change `Parallel.ForEach` to `Task.WhenAll` with some refactoring. – Peter Bons Jul 28 '17 at 09:00
  • 4
    Also, parallel.for is better for CPU bound work, and Task.WhenAll for I/O bound work. using async/await in parellel for is a bit of a design flaw imho – Peter Bons Jul 28 '17 at 09:02
  • @Peter Bons resourceImportManager.HandleEntityImageFiles doesn't return anything, I just want it to execute everything in there before moving on. – ThunD3eR Jul 28 '17 at 09:02
  • if it does not return anything, how can it be a parameter of Task.WhenAll ? – Peter Bons Jul 28 '17 at 09:03
  • @Peter Bons: public async Task HandleEntityImageFiles was a void before I tried every possible thing I can think of to make it execute everything – ThunD3eR Jul 28 '17 at 09:04
  • 4
    To echo comments above, either it's **CPU-bound**, in which case you should use `Parallel.ForEach`, or it's **IO-bound**, in which case you should use async. – Tim Rogers Jul 28 '17 at 09:09
  • then you should use `await HandleEntityImageFiles` instead of await `await Task.WhenAll(HandleEntityImageFiles)` as it returns a single Task. Now, you can run multiple tasks concurrent but then I would expect something like `await Task.WhenAll(ExecuteMeasurmentAndChartLogic(pimData.ProductNumber, entity), resourceImportManager.HandleEntityImageFiles(xxx), resourceImportManager.HandleEntityImageFiles(yyy);` – Peter Bons Jul 28 '17 at 09:09
  • 3
    But I suggest you read some more tutorials before creating such logic as you are mixing up too many things and this is not te best place to explain all these things. – Peter Bons Jul 28 '17 at 09:10
  • @Peter Bons I did use await before using Task.WhenAll, gave me the same result. As for more reading, I agree and I have but I need to make an optimization and I am trying to learn as I go. – ThunD3eR Jul 28 '17 at 09:14
  • Consider TPL Dataflow. – Stephen Cleary Jul 28 '17 at 15:17
  • @Peter Bons could you share a thought on my solution below.Thank you! – ThunD3eR Jul 28 '17 at 20:02

1 Answers1

17

Don't use Parralel.ForEach at all. Make your method to return Task instead of void, collect all the task and wait them like:

Task.WaitAll(data.Select(d => MyMethod(d, someParam)).ToArray());
Eduard Lepner
  • 639
  • 6
  • 17
  • This does give me a better result thus far but I am concerned, does it actually execute the logic for each item simultaneously? Seems to go alot slower than the loop I started with – ThunD3eR Jul 28 '17 at 09:58
  • The way I see this is that each item gets a task but item two waits for item one to execute, that is not what I want, gets me back to a normal foreach loop – ThunD3eR Jul 28 '17 at 10:08
  • 1
    The code that I provided is Asynchronous and Parallel. It might be executed in one thread if Thread pool configuration is broken or there're insufficient threads to capture. Or maybe MyMethod has bottleneck and actually access the same blocking resource. However if this method is valid, then my answer should be valid as well. – Eduard Lepner Jul 28 '17 at 11:02
  • There is not blocking resourse. I am debuging this and after a few items it stops and waits for items to execute. This is to slow – ThunD3eR Jul 28 '17 at 11:06
  • 3
    @Ra3IDeN Then your actual method isn't designed to run in parallel, and you need to fix *that*. – Servy Jul 28 '17 at 20:02
  • @Servy I think I did, Check my answer below. – ThunD3eR Jul 28 '17 at 20:05
  • 3
    @Ra3IDeN That answer *reduces* the amount of work that can be done in parallel, for the main work being done, and *incorrectly* parallelizes work that can't be done in parallel, resulting in unsafe code. – Servy Jul 28 '17 at 20:06
  • 1
    Finaly decided to accept this answer since it answers the question asked. However I will not use the solution since I need multithreading and this locks each thread. – ThunD3eR Aug 08 '17 at 19:29
  • 1
    How about `Task.WaitAll(data.Select(d => Task.Run(() => MyMethod(d, someParam)).ToArray()));` to retain parallelism? Haven't tested myself, but just an idea. – turkinator May 16 '19 at 21:51
  • One other thing to keep in mind is that WaitAll does not give you a way to stop before everything runs. In a normal Parallell.ForEach, you can loopState.Stop() and exit early if you run into a fatal case. That can be really nice when handling many operations. – MBentley Aug 13 '21 at 14:13
  • You can use Parallel.ForEachAsync https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreachasync?view=net-6.0 – Mihaimyh Aug 24 '22 at 15:55