6

I was reading this post about Parallel.ForEach where it was stated that "Parallel.ForEach is not compatible with passing in a async method."

So, to check I write this code:

static async Task Main(string[] args)
{
    var results = new ConcurrentDictionary<string, int>();

    Parallel.ForEach(Enumerable.Range(0, 100), async index =>
    {
        var res = await DoAsyncJob(index);
        results.TryAdd(index.ToString(), res);
    });         

    Console.ReadLine();
}

static async Task<int> DoAsyncJob(int i)
{
    Thread.Sleep(100);
    return await Task.FromResult(i * 10);
}

This code fills in the results dictionary concurrently.

By the way, I created a dictionary of type ConcurrentDictionary<string, int> because in case I have ConcurrentDictionary<int, int> when I explore its elements in debug mode I see that elements are sorted by the key and I thought that elenents was added consequently.

So, I want to know is my code is valid? If it "is not compatible with passing in a async method" why it works well?

Dmitry Stepanov
  • 2,776
  • 8
  • 29
  • 45
  • 1
    *Don't*. `Parallel.ForEach` is made for CPU-intensive computations and doesn't recognize async methods. It doesn't *await* them, essentially converting them to `async void` fire-and-forget calls. Your method isn't async anyway, so it's not possible to say what the *correct* call would look like – Panagiotis Kanavos Nov 27 '18 at 10:01
  • @PanagiotisKanavos please post the source of your information. Thanks – Seabizkit Nov 27 '18 at 10:03
  • 1
    `I want to know is my code is valid?`No. `why it works well?` It doesn't but you don't realize it because a) it doesn't perform any async work. – Panagiotis Kanavos Nov 27 '18 at 10:04
  • 1
    `Parallel.ForEach` - "Arrange for a number of threads to perform the following work in parallel" - `async` - "Well, there's no useful work for *this thread* to do until something else completes this awaitable". Even if it did work (it doesn't, it's entirely synchronous as Panagitotis says), you're combining things in an odd way that overallocates resources and then ignores them. – Damien_The_Unbeliever Nov 27 '18 at 10:13
  • @Seabizkit the OP's link already explains it. There's no [Parallel.ForEach](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreach?view=netframework-4.7.2) overload that accepts a Task so even when the `async` keyword is added, the method is *not* awaited. This results in [an implicit async void delegate](https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/blob/master/AsyncGuidance.md#implicit-async-void-delegates) – Panagiotis Kanavos Nov 27 '18 at 10:14
  • @DmitryS what *do* you want to do? If you want to call 100 URLs 10 at a time you could use a TransformBlock with a DOP of 10. If you just want to fire some tasks based on a list of arguments you could use `list.Select(item=>Task.Run(...))` to fire all of them, then await them. – Panagiotis Kanavos Nov 27 '18 at 10:17
  • If you want to check the bad scenario with Parallel.ForEach, use a big `Task.Delay()`, eg for 10 seconds or more and do something that could throw after it, eg using a disposable object created before the `Parallel.ForEach`. Put that `Parallel.ForEach` in a console app's `void Main()` and watch how it *doesn't* wait 10 seconds before it returns. You'll end up with exceptions from the aborted waits or attempts to access the disposed resource – Panagiotis Kanavos Nov 27 '18 at 10:21
  • @PanagiotisKanavos thanks for the links!.. I'm going to re-read the one as its great with tips. butl not 100% convinced about "doesn't recognize async methods" say the method its calling a DB operation and it takes some OI time within the threaded Parallel operation wouldn't this benefit from async, as that thread can be returned to the thread pool? – Seabizkit Nov 27 '18 at 10:36
  • 1
    @Seabizkit there's nothing to convince about. There's no overload that accepts a Task. Without that, there's no way to `await` any async calls. `async` doesn't await anything, nor does it make anything run asynchronously. `async void` calls can't be awaited, that's why they are considered bugs outside event handlers. – Panagiotis Kanavos Nov 27 '18 at 10:37

3 Answers3

10

This code works only because DoAsyncJob isn't really an asynchronous method. async doesn't make a method work asynchronously. Awaiting a completed task like that returned by Task.FromResult is synchronous too. async Task Main doesn't contain any asynchronous code, which results in a compiler warning.

An example that demonstrates how Parallel.ForEach doesn't work with asynchronous methods should call a real asynchronous method:

    static async Task Main(string[] args)
    {
        var results = new ConcurrentDictionary<string, int>();

        Parallel.ForEach(Enumerable.Range(0, 100), async index =>
        {
            var res = await DoAsyncJob(index);
            results.TryAdd(index.ToString(), res);
        });  
        Console.WriteLine($"Items in dictionary {results.Count}");
    }

    static async Task<int> DoAsyncJob(int i)
    {
        await Task.Delay(100);
        return i * 10;
    }

The result will be

Items in dictionary 0

Parallel.ForEach has no overload accepting a Func<Task>, it accepts only Action delegates. This means it can't await any asynchronous operations.

async index is accepted because it's implicitly an async void delegate. As far as Parallel.ForEach is concerned, it's just an Action<int>.

The result is that Parallel.ForEach fires off 100 tasks and never waits for them to complete. That's why the dictionary is still empty when the application terminates.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • You said in the comment that I can use list.Select(item=>Task.Run(...)) Why should i use Task.Run() in Select method? Can I use just like this: var tasks = Enumerable.Range(0, 100).Select(async index => { var res = await DoAsyncJob(index); results.TryAdd(index.ToString(), res); }); await Task.WhenAll(tasks); – Dmitry Stepanov Nov 27 '18 at 12:10
  • Async is used to free a thread which would be waiting on "I/O" stuff, so in this case, would it not be freeing some threads since the operation in the foreach maybe db access for example... Are you saying that this would be the case if the caller aka Parallel.ForEach could return Task, and the fact that it doesn't... all the Task operations in Parallel are not able to implement the async lib correctly. Did i word that ok? – Seabizkit Nov 27 '18 at 12:12
  • @DmitryS because you didn't explain what you really want to do. If you wanted to process some data items in the background *without* blocking, you could use `.Select(..=>Task.Run..)`. The same would work if you wanted to test multiple algorithms and get the first result. – Panagiotis Kanavos Nov 27 '18 at 12:13
  • @Seabizkit `Parallel.ForEach` is specifically meant for data parallelism. That's why it doesn't provide any overloads for tasks, and actually uses the *current* thread for processing. It partitions the input data into as many partitions as there are cores and uses *one* task per partition to minimize cross-thread synchronization. It's a completely different problem from asynchronous or concurrent operations – Panagiotis Kanavos Nov 27 '18 at 12:19
  • @PanagiotisKanavos THANKS! i think that is getting a lot closer to the info I'm looking for. – Seabizkit Nov 27 '18 at 12:24
  • @PanagiotisKanavos Can you please have a look at this. https://stackoverflow.com/questions/53499688/use-task-run-inside-of-select-linq-method – Dmitry Stepanov Nov 27 '18 at 12:27
  • @DmitryS that's definitely *not* what I was talking about. – Panagiotis Kanavos Nov 27 '18 at 12:28
4

An async method is one that starts and returns a Task.

Your code here

Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
    var res = await DoAsyncJob(index);
    results.TryAdd(index.ToString(), res);
});        

runs async methods 100 times in parallel. That's to say it parallelises the task creation, not the whole task. By the time ForEach has returned, your tasks are running but they are not necessarily complete.

You code works because DoAsyncJob() not actually asynchronous - your Task is completed upon return. Thread.Sleep() is a synchronous method. Task.Delay() is its asynchronous equivalent.

Understand the difference between CPU-bound and I/O-bound operations. As others have already pointed out, parallelism (and Parallel.ForEach) is for CPU-bound operations and asynchronous programming is not appropriate.

Tim Rogers
  • 21,297
  • 6
  • 52
  • 68
2

If you already have asynchronous work, you don't need Parallel.ForEach:

static async Task Main(string[] args)
{

    var results = await new Task.WhenAll(
        Enumerable.Range(0, 100)
        Select(i => DoAsyncJob(I)));

    Console.ReadLine();
}

Regarding your async job, you either go async all the way:

static async Task<int> DoAsyncJob(int i)
{
    await Task.Delay(100);
    return await Task.FromResult(i * 10);
}

Better yet:

static async Task<int> DoAsyncJob(int i)
{
    await Task.Delay(100);
    return i * 10;
}

or not at all:

static Task<int> DoAsyncJob(int i)
{
    Thread.Sleep(100);
    return Task.FromResult(i * 10);
}
Paulo Morgado
  • 14,111
  • 3
  • 31
  • 59