
I'm using HttpClient to asynchronously make many requests to an external API. I wait for all the requests to complete, then use the responses in other code. My problem is that if I make too many requests, my code throws an exception when I await Task.WhenAll.

This code will ultimately run in parallel, by which I mean that I'll be doing many sets of these async requests at the same time (e.g. 10 sets of 200 async requests). I've instantiated an HttpClient that I'm using with .NET 4.5's async/await like this:

using (var client = new HttpClient())
{
    // make a list of tasks
    var taskList = new List<Task<HttpResponseMessage>>();
    var replies = new List<MyData>();
    for (int i = 0; i < MAX_NUMBER_REQUESTS; i++)
    {
        taskList.Add(client.GetAsync(externalUri));
    }

    HttpResponseMessage[] responses = await Task.WhenAll(taskList);

    // read each response after they have returned
    foreach (var response in responses)
    {
        var reader = new System.IO.StreamReader(await response.Content.ReadAsStreamAsync());
        replies.Add(JsonConvert.DeserializeObject<MyData>(reader.ReadToEnd()));
        reader.Close();
    }

    foreach (var reply in replies)
    {
        // do something with the response from the external api call...
    }
}

I keep getting a TaskCanceledException. After looking into this, I saw it is possibly a timeout issue, but I have no idea how to fix it. I tried batching my requests into groups of 100 before calling Task.WhenAll and repeating, which worked. But when I then ran three of those sets of 100 requests in parallel, it failed again. Am I missing something? Does anyone have any insight into this?
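
For reference, the batching that worked looks roughly like this (a sketch reusing client, externalUri, and MAX_NUMBER_REQUESTS from the code above):

const int BATCH_SIZE = 100;
var responses = new List<HttpResponseMessage>();

for (int offset = 0; offset < MAX_NUMBER_REQUESTS; offset += BATCH_SIZE)
{
    var batch = new List<Task<HttpResponseMessage>>();
    int count = Math.Min(BATCH_SIZE, MAX_NUMBER_REQUESTS - offset);
    for (int i = 0; i < count; i++)
    {
        batch.Add(client.GetAsync(externalUri));
    }
    // Wait for this batch to finish before starting the next one.
    responses.AddRange(await Task.WhenAll(batch));
}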

user2503038
  • How are you running the three sets in parallel? – Adam Miller Feb 04 '14 at 16:42
  • A timer fires to begin the entire process, which then starts a timer on each object in our dataset. Each object in our dataset then calls this function to retrieve its data through API calls. If that's not clear enough I can provide the pseudocode. – user2503038 Feb 04 '14 at 16:53
  • TPL Dataflow works great with HttpClient when you need to limit concurrent requests. Launching hundreds of requests at once is probably not a good idea. – Cory Nelson Feb 04 '14 at 17:24
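
A minimal sketch of the TPL Dataflow approach Cory Nelson suggests (assumes the Microsoft.Tpl.Dataflow NuGet package; the block names and the limit of 10 are illustrative, not from this thread):

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class ThrottledDownloader
{
    public static async Task DownloadAllAsync(HttpClient client, IEnumerable<Uri> uris)
    {
        // Fetch each URI, with at most 10 requests in flight at any time.
        var downloadBlock = new TransformBlock<Uri, string>(
            uri => client.GetStringAsync(uri),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });

        // Consume each response body as it arrives.
        var processBlock = new ActionBlock<string>(body => Console.WriteLine(body.Length));

        downloadBlock.LinkTo(processBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var uri in uris)
        {
            downloadBlock.Post(uri);
        }
        downloadBlock.Complete();
        await processBlock.Completion;
    }
}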

2 Answers


Adjust ServicePointManager.DefaultConnectionLimit. For massively-concurrent requests, you can just set it to int.MaxValue.
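
A minimal sketch; the important part is setting the limit before any requests are issued:

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class ConnectionLimitExample
{
    static async Task RunAsync(Uri externalUri)
    {
        // Raise the per-host connection limit (the default outside ASP.NET is 2)
        // before any HttpClient traffic starts.
        ServicePointManager.DefaultConnectionLimit = int.MaxValue;

        using (var client = new HttpClient())
        {
            var response = await client.GetAsync(externalUri);
            // ...
        }
    }
}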

Stephen Cleary
  • I raised it to 10000 (before instantiating the HttpClient) and it didn't help. – user2503038 Feb 04 '14 at 17:40
  • This is a good post explaining why this would help: [Understanding MaxServicePointIdleTime and DefaultConnectionLimit](https://blogs.msdn.microsoft.com/jpsanders/2009/05/20/understanding-maxservicepointidletime-and-defaultconnectionlimit/) – ohw Apr 26 '17 at 01:32

The TaskCanceledException is likely due to your HttpClient being disposed before the request in question has completed. I suspect the code that actually throws the exception isn't the same as the sample you've posted, since that sample as originally written wouldn't even compile.

The following works fine for me with up to 2000 requests:

using (var client = new HttpClient())
{
    List<Task<HttpResponseMessage>> taskList = new List<Task<HttpResponseMessage>>();
    List<MyData> replies = new List<MyData>();

    // Start all requests without awaiting them individually.
    for (var i = 0; i < MAX_NUMBER_REQUESTS; ++i)
    {
        taskList.Add(client.GetAsync(externalUrl));
    }

    // Wait for every request to complete.
    var responses = await Task.WhenAll(taskList);

    // Read and deserialize each response body.
    foreach (var m in responses)
    {
        using (var reader = new StreamReader(await m.Content.ReadAsStreamAsync()))
        {
            replies.Add(JsonConvert.DeserializeObject<MyData>(reader.ReadToEnd()));
        }
    }

    foreach (var reply in replies)
    {
        // TODO:
    }
}

So there is likely some other issue in your code, and you haven't provided enough detail for anyone to figure out what it is.

In addition to the issues I commented on about using Parallel.ForEach: Parallel.ForEach blocks the calling thread, so if you run it on the UI thread your UI will be blocked.
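
If you do need to cap how many requests run at once without blocking, one common non-blocking pattern (not from the code above; the names and the limit of 10 are illustrative) is a SemaphoreSlim throttle:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledRequests
{
    public static async Task<string[]> GetAllAsync(
        HttpClient client, IReadOnlyList<Uri> uris, int maxConcurrency = 10)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = uris.Select(async uri =>
            {
                await gate.WaitAsync();     // wait for a free slot
                try
                {
                    return await client.GetStringAsync(uri);
                }
                finally
                {
                    gate.Release();         // free the slot for the next request
                }
            });
            // All tasks finish before the semaphore is disposed.
            return await Task.WhenAll(tasks);
        }
    }
}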

Peter Ritchie
  • Now assume you use this code in a web crawler and you want to download millions of URLs from thousands of different sites. Do you think it is logical to start millions of tasks at once? (I liked this. It is easier to comment on someone's answer than writing one :) ) – L.B Feb 04 '14 at 21:18
  • That's not the point; the point is that the code works as described and does not show the errors he detailed. To "fix" his problem, we need more info. – Peter Ritchie Feb 04 '14 at 21:20