I would like to call my API in parallel a given number of times so that processing finishes quickly. Below are three ways I have found to call APIs in parallel. I am trying to understand which is the best way to do this.

Base Code

var client = new System.Net.Http.HttpClient();
client.DefaultRequestHeaders.Add("Accept", "application/json");

client.BaseAddress = new Uri("https://jsonplaceholder.typicode.com");
var list = new List<int>();

var listResults = new List<string>();
for (int i = 1; i < 5; i++)
{
    list.Add(i);
}

1st Method using Parallel.ForEach

Parallel.ForEach(list,new ParallelOptions() { MaxDegreeOfParallelism = 3 }, index =>
{
    var response = client.GetAsync("posts/" + index).Result;

    var contents =  response.Content.ReadAsStringAsync().Result;
    listResults.Add(contents);
    Console.WriteLine(contents);
});

Console.WriteLine("After all parallel tasks are done with Parallel for each");

2nd Method with Tasks. I am not sure whether this runs in parallel. Let me know if it does.

var loadPosts = new List<Task<string>>();
foreach(var post in list)
{
    var response = await client.GetAsync("posts/" + post);

    var contents = response.Content.ReadAsStringAsync();
    loadPosts.Add(contents);
    Console.WriteLine(contents.Result);
}

await Task.WhenAll(loadPosts);

Console.WriteLine("After all parallel tasks are done with Task When All");

3rd Method using Action Block - This is what I believe I should always do, but I want to hear from the community

var responses = new List<string>();

var block = new ActionBlock<int>(
    async x => {
        var response = await client.GetAsync("posts/" + x);
        var contents = await response.Content.ReadAsStringAsync();
        Console.WriteLine(contents);
        responses.Add(contents);                
    },
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 6, // Parallelize on all cores
    });

for (int i = 1; i < 5; i++)
{
    block.Post(i);
}

block.Complete();
await block.Completion;

Console.WriteLine("After all parallel tasks are done with Action block");
MikeLimaSierra
Learn AspNet
  • I am confused by code block no. 3. What is x? Is that the name of a function or an anonymous method? – toha Jul 20 '22 at 14:56

3 Answers

Approach number 2 is close. Here's a rule of thumb: for I/O-bound operations, use Tasks with Task.WhenAll (asynchrony); for compute-bound operations, use parallelism. HTTP requests are network I/O.

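var tasks = new List<Task<string>>();

foreach (var post in list)
{
    // Local async function: this only defines the work for one post.
    async Task<string> func()
    {
        var response = await client.GetAsync("posts/" + post);
        return await response.Content.ReadAsStringAsync();
    }

    // Called without await, so the request starts and the loop moves on.
    tasks.Add(func());
}

// Completes once every request has finished.
await Task.WhenAll(tasks);

var postResponses = new List<string>();

foreach (var t in tasks)
{
    var postResponse = await t; // t.Result would be okay too, since t has completed.
    postResponses.Add(postResponse);
    Console.WriteLine(postResponse);
}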
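
As a minimal sketch of why the loop above does not wait for each request, the example below replaces the HTTP call with a hypothetical `FakeRequestAsync` that cannot finish until a gate is opened. The names `HotTaskDemo`, `FakeRequestAsync`, and `gate` are illustrative assumptions, not part of the original code. Every call logs its start before the loop ends, even though no "response" has arrived yet.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class HotTaskDemo
{
    // Hypothetical stand-in for an HTTP call; it completes only after the gate opens.
    private static async Task<string> FakeRequestAsync(
        int id, ConcurrentQueue<string> log, Task gate)
    {
        log.Enqueue($"start {id}");   // runs synchronously, before the first await
        await gate;                   // simulated wait for the server's response
        log.Enqueue($"done {id}");
        return $"post {id}";
    }

    public static async Task<string[]> RunAsync()
    {
        var log = new ConcurrentQueue<string>();
        var gate = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        var tasks = new List<Task<string>>();

        foreach (var id in new[] { 1, 2, 3 })
        {
            // No await here: the call returns a Task that is already running.
            tasks.Add(FakeRequestAsync(id, log, gate.Task));
        }

        log.Enqueue("loop finished"); // reached while all three requests are pending
        gate.SetResult(true);         // let the simulated responses arrive
        await Task.WhenAll(tasks);
        return log.ToArray();
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync())
        {
            Console.WriteLine(line);
        }
    }
}
```

The three "start" lines and "loop finished" always come first, in that order; the "done" lines follow in whatever order the continuations happen to run.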
Yuli Bonner
  • @Yulli, but this is not running in parallel. Aren't we waiting for each response to be returned before calling the next one in the loop? I thought Action block was the best way to go for HTTP requests and Parallel.ForEach for compute operations? – Learn AspNet Oct 11 '19 at 22:35
  • Notice there is no await in front of the func() call. So no, this code does not wait on any task inside the first for loop. The local function returns a Task. The tasks collection is a collection of Task objects that may or may not be complete. The await WhenAll creates a continuation that executes after all of the Task objects in tasks have completed. Async/await/WhenAll is about asynchrony, not parallelism. Your confusion may be around the local function. async Task func() is a function definition, not a function call. – Yuli Bonner Oct 11 '19 at 22:53
  • @LearnAspNet When the tasks.Add(func()) is called, the task is started immediately and returns a token (which is a task) stored in the tasks list. – Thai Anh Duc Oct 15 '19 at 02:09
  • @YuliBonner What if I want to specify the number of cores to be used? Also, in func, if you have await client.GetAsync, won't it wait for the operation to complete before the next task can be added to the tasks list and the for loop can continue? – Learn AspNet Oct 15 '19 at 17:36
  • As far as cores go, you're not going to get any benefit from using additional cores. Firing off an HTTP request takes very little CPU. All the processing is happening on the server. Tasks can use multiple threads from the ThreadPool, but in this case it will almost certainly just reuse the same thread, because there is very little processing happening. As far as func goes, the call to func is not awaited. All of the HTTP requests will be concurrently outstanding. All of the Tasks in tasks are running when it is passed into WhenAll. – Yuli Bonner Oct 15 '19 at 17:57
  • @YuliBonner I am still trying to understand why the third method is not the best method. When should we use action blocks, then? – Learn AspNet Oct 17 '19 at 15:36
  • In response to your last comment, I would say DataFlow is overkill for this example. It can be useful to build complex pipelines with a mix of sequential & parallel jobs but issuing concurrent HTTP requests is a basic use case for the built-in TPL (it doesn't need a NuGet package to be installed, unlike DataFlow). – jamespconnor Oct 17 '19 at 16:24
  • @LearnAspNet version 2 is simpler code and makes it easier to understand. If you want to limit the number of parallel requests you can just add a fixed number of items in the list and wait on them, then when those are done, continue with the next batch. – Claudiu Guiman Oct 18 '19 at 16:31
  • Is there a difference between the func() function being inside the foreach loop and being a method of its own? – Josh Monreal Mar 10 '21 at 21:40
  • The 1st for loop will block the UI thread until all iterations are complete even though it is not blocking for response. So if there are 50000 iterations, then the UI will appear blocked for some time. – variable Aug 02 '21 at 03:38
  • Please, can I have more information about this Rule? @YuliBonner – BorisD Aug 19 '22 at 09:38
  • Also see this https://mortaza-ghahremani.medium.com/task-whenall-vs-parallel-foreach-816d1cb0b7a – BorisD Aug 19 '22 at 09:46
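
The comments above ask how to cap the number of requests in flight while still using Task.WhenAll. One common way to do that, sketched here as an assumption rather than as anything the answer prescribes, is to guard each call with a SemaphoreSlim. The helper name `RunAsync`, the class `ThrottledRequests`, and the simulated request in `DemoAsync` are hypothetical; in real code the `work` delegate could be something like `id => client.GetStringAsync("posts/" + id)`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRequests
{
    // Runs `work` for every item with at most `maxConcurrency` calls in flight.
    public static async Task<TResult[]> RunAsync<T, TResult>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            async Task<TResult> RunOneAsync(T item)
            {
                await gate.WaitAsync();          // wait for a free slot
                try { return await work(item); }
                finally { gate.Release(); }      // free the slot, even on failure
            }

            var tasks = items.Select(item => RunOneAsync(item)).ToArray();
            return await Task.WhenAll(tasks);
        }
    }

    // Demo with a simulated request that records the peak number of calls in flight.
    public static async Task<(string[] Results, int Peak)> DemoAsync()
    {
        int inFlight = 0, peak = 0;
        var sync = new object();

        var results = await RunAsync(Enumerable.Range(1, 10), 3, async id =>
        {
            lock (sync) { inFlight++; peak = Math.Max(peak, inFlight); }
            await Task.Delay(20);                // simulated network latency
            lock (sync) { inFlight--; }
            return $"post {id}";
        });

        return (results, peak);
    }

    public static async Task Main()
    {
        var (results, peak) = await DemoAsync();
        Console.WriteLine($"{results.Length} results, peak concurrency {peak}");
    }
}
```

Unlike fixed batches, a semaphore starts the next request as soon as any slot frees up, so one slow response does not hold back the rest. The recorded peak never exceeds the limit passed in.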

I made a little console app to test all the methods by pinging the API "https://jsonplaceholder.typicode.com/todos/{i}" 200 times. @MikeLimaSierra Method 1 or 3 were the fastest!

| Method | DegreeOfParallelism | Time |
|---|---|---|
| Not Parallel | n/a | 8.4 sec |
| @LearnAspNet (OP) Method 1 | 2 | 5.494 sec |
| @LearnAspNet (OP) Method 1 | 30 | 1.235 sec |
| @LearnAspNet (OP) Method 3 | 2 | 4.750 sec |
| @LearnAspNet (OP) Method 3 | 30 | 1.795 sec |
| @jamespconnor Method | n/a | 21.5 sec |
| @YuliBonner Method | n/a | 21.4 sec |
Ted M
  • I don't see any methods from MikeLimaSierra, so your results are very confusing. What did you test exactly? Do you have the source for that? – CornelC Jun 14 '21 at 14:10
  • Correction: I meant the OP (@LearnAspNet)... I guess the OP was edited by MikeLimaSierra – Ted M Aug 05 '21 at 19:21

I would use the following. It has no control over concurrency (it dispatches all HTTP requests at once, unlike your 3rd Method), but it is a lot simpler: it has only a single await.

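var client = new HttpClient
{
    // A base address is required for the relative "posts/" URIs below to resolve.
    BaseAddress = new Uri("https://jsonplaceholder.typicode.com")
};
var list = new[] { 1, 2, 3, 4, 5 };
// Lazy query: one request per id, not sent until the query is enumerated.
var postTasks = list.Select(p => client.GetStringAsync("posts/" + p));
// WhenAll enumerates the query, which starts every request, then awaits them all.
var posts = await Task.WhenAll(postTasks);
foreach (var postContent in posts)
{
    Console.WriteLine(postContent);
}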
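
A small sketch of one property this pattern relies on: Task.WhenAll returns results in the order of the input tasks, not in the order they complete. The `FakeGetStringAsync` helper is a hypothetical stand-in for `client.GetStringAsync`, and it deliberately makes later ids finish sooner. The `ToArray` call is an optional habit rather than a requirement here; it starts each task exactly once even if your own code enumerates the sequence again later.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllOrderDemo
{
    // Hypothetical stand-in for client.GetStringAsync: higher ids finish sooner.
    private static async Task<string> FakeGetStringAsync(int id)
    {
        await Task.Delay((6 - id) * 20);
        return $"post {id}";
    }

    public static async Task<string[]> RunAsync()
    {
        var list = new[] { 1, 2, 3, 4, 5 };

        // ToArray materializes the lazy query, so every task is created once, here.
        var postTasks = list.Select(id => FakeGetStringAsync(id)).ToArray();

        // The result array lines up with postTasks, regardless of completion order.
        return await Task.WhenAll(postTasks);
    }

    public static async Task Main()
    {
        foreach (var post in await RunAsync())
        {
            Console.WriteLine(post);
        }
    }
}
```

This prints "post 1" through "post 5" in that order, even though the simulated request for id 5 completes first.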
jamespconnor
  • My third method will dispatch everything at same time – Learn AspNet Oct 17 '19 at 15:27
  • Your third method won't dispatch everything at the same time (in your case, it will only do that if you have less than 6 items) - you set `MaxDegreeOfParallelism = 6` so it will only have a maximum of 6 items in flight at a time. – jamespconnor Oct 17 '19 at 16:20
  • Yeah, that is what I meant, that it will send 6 requests at the same time. Do you ever use Action blocks or just use Task.WhenAll? – Learn AspNet Oct 17 '19 at 16:27
  • I like your code snippet because it keeps things simple, but there's a very subtle issue with your code: Task.WhenAll(IEnumerable) is allowed to enumerate several times over the enumeration, which means that each time the enumerator is restarted the LINQ query will re-run, issuing more HTTP requests than desired. As a simple fix, you should convert postTasks to an array (or list) before passing it to Task.WhenAll: `var posts = await Task.WhenAll(postTasks.ToArray());` – Claudiu Guiman Oct 18 '19 at 16:27
  • @ClaudiuGuiman - you raise a good point, but do you have a source for `Task.WhenAll(IEnumerable) is allowed to enumerate several times over the enumeration`? This post (by @stephen-cleary - well known for his TPL posts/knowledge) seems to contradict that: https://stackoverflow.com/a/43762566/1238322 – jamespconnor Oct 19 '19 at 15:29
  • @ClaudiuGuiman *Task.WhenAll(IEnumerable) is allowed to enumerate several times over the enumeration.* [citation needed] – Theodor Zoulias Oct 24 '19 at 01:31
  • @jamespconnor you're right, I misspoke - In the past I ran into issues because I manually iterated through the enumeration for example to re-fire tasks that failed so I got confused here and simplified the scenario by saying Task.WhenAll also can do that. I still think it's a good rule to follow, but in this exact example it won't change the behavior in any way. – Claudiu Guiman Oct 25 '19 at 14:58