I am experimenting with parallel execution of tasks in .NET. I have implemented the function below, which executes a list of tasks in parallel using Task.WhenAll. I have also found two ways to add tasks to the list. Option 1 is to call Task.Run and pass it a Func delegate. Option 2 is to invoke the Func delegate directly and add the task it returns.

So my questions are:

  1. Task.Run (Option 1) takes additional threads from the thread pool and runs the tasks on them before they are passed to Task.WhenAll. Does Task.WhenAll run each task in the list asynchronously, so that threads are taken from and returned to the thread pool as needed, or are all the taken threads blocked until execution completes (or an exception is thrown)?
  2. Does it make any difference whether I pass a synchronous (non-awaitable) or an asynchronous (awaitable) delegate to Task.Run?
  3. In Option 2, in theory no additional threads are taken from the thread pool to execute the tasks in the list, yet the tasks still run concurrently. Does Task.WhenAll create threads internally, or are all the tasks executed on a single thread created by Task.WhenAll? And how does SemaphoreSlim affect the concurrent tasks?

What do you think is the best approach to deal with asynchronous parallel tasks?

    public static async Task<IEnumerable<TResult>> ExecTasksInParallelAsync<TSource, TResult>(IEnumerable<TSource> source, Func<TSource, Task<TResult>> task, int minDegreeOfParallelism = 1, int maxDegreeOfParallelism = 1)
    {
        var allTasks = new List<Task<TResult>>();

        using (var throttler = new SemaphoreSlim(minDegreeOfParallelism, maxDegreeOfParallelism))
        {
            foreach (var element in source)
            {
                // do an async wait until we can schedule again
                await throttler.WaitAsync();

                Func<Task<TResult>> func = async () =>
                {
                    try
                    {
                        return await task(element);
                    }
                    finally
                    {
                        throttler.Release();
                    }
                };

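                // Option 1: start the delegate on a thread-pool thread
                allTasks.Add(Task.Run(func));
                // Option 2 (alternative to Option 1; use one or the other, not both,
                // otherwise each element runs twice and the semaphore is released twice):
                // allTasks.Add(func.Invoke());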
            }

            return await Task.WhenAll(allTasks);
        }
    }

The function above is called as follows:

    [HttpGet()]
    public async Task<IEnumerable<string>> Get()
    {
        using (var client = new HttpClient())
        {
            var source = Enumerable.Range(1, 1000).Select(x => "https://dog.ceo/api/breeds/list/all");
            var result = await Class1.ExecTasksInParallelAsync(
                source, async (x) =>
                {
                    var responseMessage = await client.GetAsync(x);

                    return await responseMessage.Content.ReadAsStringAsync();
                }, 100, 200);

            return result;
        }
    }

Dremlin
  • Did you read existing https://www.bing.com/search?q=c%23+task+thread questions? (And you really should ask one question per post - otherwise it is way too broad for SO) – Alexei Levenkov Mar 10 '18 at 02:33
  • I read MSDN and many other articles but did not find answers to my questions. Why do you think my questions are too broad? I am asking specific questions about the code snippets. I am not asking something like "which is better, Task or Thread?", right? – Dremlin Mar 10 '18 at 02:38
  • 5 > 1: counting question marks is an easy way to estimate whether you asked too many questions in one post (and you are also asking for opinions - "What do you think is the best approach to deal with asynchronous parallel tasks?" - which is another off-topic reason by itself) – Alexei Levenkov Mar 10 '18 at 03:45
  • "What do you think is the best approach to deal with asynchronous parallel tasks?" - I am asking which of the options listed above is best, or whether anyone has a better one. – Dremlin Mar 10 '18 at 03:49

1 Answer

Option 2 tested better

I ran a few tests using your code and determined that option 2 is roughly 50 times faster than option 1, on my machine at least. However, using PLINQ was even 10 times faster than option 2.

Option 3, PLINQ, is even faster

You could replace that whole mess with a single line of PLINQ:

    return source.AsParallel().WithDegreeOfParallelism(maxDegreeOfParallelism)
        .Select( s => task(s).GetAwaiter().GetResult() );

Oops... option 4

Turns out my prior solution would reduce parallelism if the task was actually async (I had been testing with a dummy synchronous function). This solution fixes the problem:

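    // ToList() materializes the query so that each task is started exactly once;
    // without it, enumerating the query a second time would invoke task(s) again.
    var tasks = source.AsParallel()
        .WithDegreeOfParallelism(maxDegreeOfParallelism)
        .Select( s => task(s) )
        .ToList();
    return await Task.WhenAll(tasks);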
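As a rough illustration of why materializing the query matters here, the sketch below counts how often the delegate is invoked when a deferred LINQ query of tasks is enumerated twice, compared with a list built once by `ToList()`. It uses plain LINQ rather than PLINQ so the counts stay deterministic; PLINQ queries are deferred in the same way. The class name `DeferredExecutionDemo`, the method `CountInvocationsAsync`, and the doubling delegate are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredExecutionDemo
{
    // Hypothetical helper: returns how many times each delegate was invoked.
    public static async Task<(int lazyCalls, int materializedCalls)> CountInvocationsAsync()
    {
        var source = Enumerable.Range(1, 5);

        int lazyCalls = 0;
        // Deferred query: the lambda runs every time the query is enumerated.
        IEnumerable<Task<int>> lazy = source.Select(x =>
        {
            lazyCalls++;
            return Task.FromResult(x * 2);
        });
        await Task.WhenAll(lazy);                         // first enumeration: 5 calls
        var again = lazy.Select(t => t.Result).ToList();  // second enumeration: 5 more calls

        int materializedCalls = 0;
        // Materialized list: the lambda runs once per element, right here.
        List<Task<int>> materialized = source.Select(x =>
        {
            materializedCalls++;
            return Task.FromResult(x * 2);
        }).ToList();
        await Task.WhenAll(materialized);
        var results = materialized.Select(t => t.Result).ToList(); // no new calls

        return (lazyCalls, materializedCalls);
    }
}
```

With five source elements, the deferred version invokes the delegate ten times and the materialized version five times. With real HTTP calls, that second enumeration would issue every request again, possibly after the caller has left its `using` block.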

I ran this on my laptop with 10,000 iterations. I did three runs to ensure that there wasn't a priming effect. Results:

    Run 1
    Option 1: Duration: 13727ms
    Option 2: Duration: 303ms
    Option 3: Duration: 39ms
    Run 2
    Option 1: Duration: 13586ms
    Option 2: Duration: 287ms
    Option 3: Duration: 28ms
    Run 3
    Option 1: Duration: 13580ms
    Option 2: Duration: 316ms
    Option 3: Duration: 32ms

You can try it on DotNetFiddle but you'll have to use much shorter runs to stay within quota.
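For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing harness built on `Stopwatch`. It is not the exact harness used for the numbers above: the names `TimingSketch`, `MeasureAsync`, `RunAsync`, and `DummyWorkAsync` are made up for illustration, and the dummy delegate completes synchronously, so it says nothing about real I/O-bound work.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TimingSketch
{
    // Hypothetical stand-in for the real work; completes synchronously.
    public static Task<int> DummyWorkAsync(int x) => Task.FromResult(x + 1);

    // Runs the given asynchronous action once and returns the elapsed milliseconds.
    public static async Task<long> MeasureAsync(Func<Task> run)
    {
        var sw = Stopwatch.StartNew();
        await run();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Repeats the measurement several times so a warm-up effect would be visible.
    public static async Task RunAsync(int iterations = 10000, int runs = 3)
    {
        var source = Enumerable.Range(0, iterations).ToList();
        for (int runNo = 1; runNo <= runs; runNo++)
        {
            long ms = await MeasureAsync(async () =>
            {
                // Start all tasks, then wait for every one of them to finish.
                var tasks = source.Select(x => DummyWorkAsync(x)).ToList();
                await Task.WhenAll(tasks);
            });
            Console.WriteLine($"Run {runNo}: Duration: {ms}ms");
        }
    }
}
```

Swapping the body of the lambda passed to `MeasureAsync` for each option in turn gives comparable per-run durations. Absolute numbers will vary by machine and by how much of the work is truly asynchronous.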

In addition to allowing very short and powerful code, PLINQ performed very well for parallel processing in these tests. LINQ takes a functional programming approach, and code written in that style is generally easier to parallelize.

John Wu
  • Thank you, John, for the answer. Unfortunately, your code does not work with asynchronous tasks. I tried it in an ASP.NET Core controller, as in the second snippet in the question, replacing ExecTasksInParallelAsync with ExecTasksInParallelAsync3, but I get the exception "Cannot access a disposed object. Object name: 'System.Net.Http.HttpClient'." The HttpClient has been disposed. Another question: how effectively does PLINQ utilize threads taken from the thread pool? – Dremlin Mar 10 '18 at 03:37
  • Interesting. I copied your function (I changed the URL to `www.google.com`) and cannot reproduce. Sure there isn't something else going on? BTW `HttpClient` [isn't meant to be disposed](https://stackoverflow.com/questions/15705092/do-httpclient-and-httpclienthandler-have-to-be-disposed) with each and every use. – John Wu Mar 10 '18 at 03:46
  • If you take the Get action method and replace "await Class1.ExecTasksInParallelAsync" with ExecTasksInParallelAsync3, you will see the error in a WebApi2 or ASP.NET Core controller. BTW, I am not disposing the HttpClient after each request. That's why it has global scope within the whole controller. – Dremlin Mar 10 '18 at 04:28
  • Apart from that, you invoke an awaitable task synchronously, which blocks the root thread. – Dremlin Mar 10 '18 at 06:11
  • Edited based on [this other idea](https://stackoverflow.com/questions/16340498/transform-ienumerabletaskt-asynchronously-by-awaiting-each-task). Funny because I didn't think ASP.NET core had a "root thread," since [there is no SynchronizationContext](https://blog.stephencleary.com/2017/03/aspnetcore-synchronization-context.html). I could see some problems on .NET framework though. Which are you using? BTW if you don't dispose HttpClient, how is it you got that error? – John Wu Mar 10 '18 at 07:49
  • You are right, there is no SynchronizationContext in ASP.NET Core. However, when an HTTP request is being processed in ASP.NET Core, at least one thread is taken from the thread pool anyway, right? So by invoking a task synchronously you take a thread from the thread pool and block it until the task completes. In other words, you are not calling the tasks asynchronously and not leveraging the .NET asynchronous programming model. Regarding HttpClient, I do not yet know why that happens, because when I use my version of the parallel task execution, the HttpClient object is not disposed unexpectedly. – Dremlin Mar 10 '18 at 08:47
  • BTW, this https://stackoverflow.com/questions/41126283/combining-plinq-with-async-method thread explains that PLINQ is good only if you want to run tasks synchronously. – Dremlin Mar 10 '18 at 08:48
  • Please check [this](https://blog.stephencleary.com/2017/03/aspnetcore-synchronization-context.html) again, especially *ASP.NET Core does not have a SynchronizationContext, so await defaults to the thread pool context. So, in the ASP.NET Core world, asynchronous continuations may run on any thread, and they may all run in parallel.* Anyway, I got rid of the blocking stuff. – John Wu Mar 10 '18 at 09:56
  • It seems you do not understand the purpose of asynchronous calls. I can improve your code: `var tasks = source.AsParallel().WithDegreeOfParallelism(maxDegreeOfParallelism).Select( async s => await task(s) ); return await Task.WhenAll(tasks);` But the problem with PLINQ is that it takes additional threads from the thread pool and blocks them. – Dremlin Mar 10 '18 at 10:59
  • I do understand (I read [there is no thread](https://blog.stephencleary.com/2013/11/there-is-no-thread.html) and all that) but that only takes you to a point. Everything changes when there is no synchronization context. Your code could be switching threads every time it sees an `await`, as there is no effort to route the continuation anywhere specific. – John Wu Mar 10 '18 at 11:26
  • BTW I think the issue with `Dispose()` might not be async but deferred execution. Try calling `ToList()` before exiting the using block. – John Wu Mar 10 '18 at 11:30