
I've been trying to learn C#'s async with HttpClient and have come across an issue where I can't figure out what's happening.

What I want to do:

  • Send multiple HttpClient requests in one go and have each response processed individually as soon as it returns. This process is repeated in a loop forever (i.e. a new batch of requests is sent off every few seconds). The processing done on each response is not CPU-intensive and takes a few milliseconds at most.
  • Quantity is 5-50 POST requests in any one batch
  • I don't care about the order they come back, but each request/response must be processed as fast as possible without blocking.

The problem:

  • Whenever I send a batch of requests together, each successive response seems to take slightly longer than the previous one, even though they should each take approximately the same time. e.g.:

    Time taken: 67.9609 milliseconds
    Time taken: 89.2413 milliseconds
    Time taken: 100.2345 milliseconds
    Time taken: 117.7308 milliseconds
    Time taken: 127.6843 milliseconds
    Time taken: 132.3066 milliseconds
    Time taken: 157.3057 milliseconds

They should all be just around 70 ms... This repeats for every new batch; the first one is quick, then each consecutive one slows down.

My Code
There are 3 main parts. I've simplified it as best I can to focus on where the problem is occurring (Downloader.cs is what I timed).

  • Downloader.cs This contains the method that does the actual http query:

    public async Task<string> DownloadData(string requestString)
    {
        HttpResponseMessage response;
        response = await this.httpClient.PostAsync( url, new StringContent( requestString ));
        string returnString = await response.Content.ReadAsStringAsync();
        return returnString;
    }
    
  • QueryAgent.cs This initialises a new Downloader and contains the main async looping caller function and the async method it calls in a loop.
    I need to send off new requests as soon as possible depending on the result of the last set of responses (basically, the rate at which requests go out can change, so I can't do a Task.Delay() at the end of the while loop).

    public async Task DownloadLoopAsync() 
    {
        while ( this.isLooping )
        {
            // I put this here to stop it using lots of cpu. 
            // Not sure if it is best practice.
            // advice on how to do this kind of loop would be appreciated.
            await Task.Delay( 5 );  
    
            foreach ( string identifier in this.Identifiers )
            {
                string newRequestString = this.generateRequestString(identifier );
                Task newRequest = RequestNewDataAsync(newRequestString);
            }
        }
    }     
    
    public async Task RequestNewDataAsync(string requestString)
    {
        string responseString = await this.downloader.DownloadData(requestString);
        this.ProcessedData = this.ProcessData(responseString);
    }
    
  • Program.cs This is just a console program that starts the async function in QueryAgent once and then calls Console.ReadLine() at the end to stop the program exiting. I initialize the HttpClient in here too. I call DownloadLoopAsync() once, as such:

    QueryAgent myAgent = new QueryAgent(httpClient);
    Task dataLoop = myAgent.DownloadLoopAsync();   
    Console.ReadLine();
    

I have a suspicion it has something to do with the way I'm starting a new Task in the foreach loop in DownloadLoopAsync(), but this doesn't make sense to me because I assumed each new Task would be processed on its own.

Any advice is greatly appreciated as I have been hitting brick wall after brick wall trying to get this working as intended.

Thank you.

EDIT 1
Just had a mess around in the calling Task loop. I changed:

foreach(string identifier in this.Identifiers)
{
    string newRequestString = this.generateRequestString(identifier );
    Task newRequest = RequestNewDataAsync(newRequestString);
}

to

foreach(string identifier in this.Identifiers)
{
    string newRequestString = this.generateRequestString(identifier );
    Task newRequest = RequestNewDataAsync(newRequestString);
    await Task.Delay(1);
}

which now makes it work as expected:

Time taken: 71.0088 milliseconds
Time taken: 65.0499 milliseconds
Time taken: 64.0955 milliseconds
Time taken: 64.4291 milliseconds
Time taken: 63.0841 milliseconds
Time taken: 64.9841 milliseconds
Time taken: 63.7261 milliseconds
Time taken: 66.451 milliseconds
Time taken: 62.4589 milliseconds
Time taken: 65.331 milliseconds
Time taken: 63.2761 milliseconds
Time taken: 69.7145 milliseconds
Time taken: 63.1238 milliseconds

Now I'm even more confused why this appears to work!

EDIT 2
The timing code exists as follows inside Downloader.cs:

public async Task<string> DownloadData(string requestString)
{
    HttpResponseMessage response;
    Stopwatch s = Stopwatch.StartNew();
    response = await this.httpClient.PostAsync( url, new StringContent( requestString ));
    double timeTaken = s.Elapsed.TotalMilliseconds;
    Console.WriteLine( "Time taken: {0} milliseconds", timeTaken );
    string returnString = await response.Content.ReadAsStringAsync();
    return returnString;
}

Although the await Task.Delay( 1 ); seems to "do the trick", it is not an ideal solution, nor do I understand why it works - perhaps it somehow gives the CPU time to initialize the RequestNewDataAsync Task and prevents a lockup of some sort from occurring? I can only guess... It also limits the loop to 1000 Hz, which is fast, but still a limit, since Task.Delay() only accepts integer values, of which 1 is the minimum.

EDIT 3
Doing a bit more reading (I am still relatively new to all this!), maybe this is something best left to the thread pool (10 to 100s of short-lived tasks)? However, wouldn't this create excessive overhead (1 MB per new task?), or is that something different again?
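
From what I've read, a dedicated Thread reserves its own stack (roughly 1 MB by default), whereas a Task queued with Task.Run reuses already-created thread-pool threads, so there shouldn't be a 1 MB cost per Task. A minimal sketch of the two for comparison (illustrative only):

    // A dedicated thread reserves its own stack (about 1 MB by default).
    Thread dedicated = new Thread(() => Console.WriteLine("dedicated thread"));
    dedicated.Start();

    // A Task queued to the thread pool reuses existing pool threads,
    // so no new stack is allocated per Task.
    Task pooled = Task.Run(() => Console.WriteLine("thread-pool task"));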

EDIT 4
Did a little bit more experimenting, putting the Task.Delay(1) in different places.

  • [slowdown occurs] Placing before or after httpClient.PostAsync() in Downloader.cs
  • [slowdown occurs] Placing before or after the await call in RequestNewDataAsync()
  • [no slowdown / expected performance] Placing before or after the Task call in DownloadLoopAsync()
  • Where are you sending the requests to? Is it possible you are just putting load on the server, hence the slower response? – Ewan Mar 25 '15 at 11:58
  • `HttpClient` cannot be used concurrently, it is not thread-safe. You also never await the resulting task from `RequestNewDataAsync` so you are sending more requests before the earlier ones have completed, putting more and more load on the server. – Lukazoid Mar 25 '15 at 12:01
  • your approach seems slightly non-standard. I would suggest you have a list of tasks which you add to and then await.all rather than the loop – Ewan Mar 25 '15 at 12:01
  • HttpClient should be ok as long as you make a new one each time surely – Ewan Mar 25 '15 at 12:04
  • @Ewan Exactly, however the OP is creating a single `HttpClient` which is used concurrently in `QueryAgent` – Lukazoid Mar 25 '15 at 12:05
  • I don't quite understand how you got to your suspicion, since you said that you timed Downloader.cs, which contains the DownloadData method. The method takes longer to execute - however, this is unrelated to how/when it is called in the loop, isn't it? Thus a higher server load seems more plausible to me, even though I wouldn't have expected that 7 requests already cause a 90-millisecond delay. – H W Mar 25 '15 at 12:12
  • Ewan, I am confident that the number of requests shouldn't be an issue. My original approach was indeed to create a new HttpClient for every request. This worked well! The only problem is every new request created a new connection with the server nullifying the use of a keep-alive header. So, instead of taking 100ms it would take 1000ms every request. I could send 100s of these and each would process nicely without this slowdown problem. This response time is too slow and there may be overhead by creating so many new HttpClients. I've also read this is how HttpClient is intended to be used – undercurrent Mar 25 '15 at 12:17
  • @Anthony: do the execution times change when you use await `Task.Delay( 5000 );` instead of `await Task.Delay( 5 );`? – H W Mar 25 '15 at 12:24
  • `I need to send of new requests as soon as possible depending on the result of the last set of responses (basically, the rate at which requests go out can change so I can't do a Task.delay() at the end of the while loop)` - doesn't that mean that you want to fire several Tasks `newRequest` and await all of them, before continuing with the next loop? In this case, check out http://stackoverflow.com/questions/12337671/using-async-await-for-multiple-tasks for an example how to await multiple parallel running tasks. – H W Mar 25 '15 at 12:27
  • H W, Just tried it with 1000, 5000. No change; the consecutive slowdown is still there and response times are the same. EDIT: H W, I still need to send new requests even if the previous ones have not returned. Basically I'm trying to keep a steady min/max rate of requests going. So I could send new requests out at 1Hz then increase that to 10Hz before any have returned. If timeouts happen (which they may) I just deal with that in a cancellation token (currently 5s). Basically I don't want new requests dependent on responses to previous requests. – undercurrent Mar 25 '15 at 12:30
  • 1
    @Anthony: Have you set `ServicePointManager.DefaultConnectionLimit`? Also, please post a complete minimal repro. – Stephen Cleary Mar 25 '15 at 12:31
  • Stephen, I have. ServicePointManager.DefaultConnectionLimit = 5000; ServicePointManager.UseNagleAlgorithm = false; ServicePointManager.Expect100Continue = false; – undercurrent Mar 25 '15 at 12:36
  • As a comparison, when the connection limit is set to 2, the consecutive response times are: 68.0528 milliseconds, 76.4045 milliseconds, 132.8217 milliseconds, 142.4365 milliseconds, 196.8486 milliseconds, 217.773 milliseconds. So there's an even more significant slowdown dependent on that default limit. 26 requests result in an accumulation from 68 ms up to 865 ms. 5000 should be more than enough to prevent this? – undercurrent Mar 25 '15 at 12:42
  • Since you use the same httpClient to process multiple requests, it seems like opening more requests on the same httpClient slows the processing down quite a bit (which would be a good reason why the default connectionLimit is as low as 2). The times that you posted in your comment right above mine seem quite reasonable: 68,142,196 and 72,132,217, are each *about* 70 milliseconds apart, which is logical since one request takes about 70 milliseconds, you fire them all at once and they are processed in batches of two. – H W Mar 25 '15 at 12:51
  • To confirm the assumption that the slower response times are caused by an overloaded httpClient - could you add a loop in program.cs which initializes a new httpClient and QueryAgent object in each runthrough and fire DownloadLoopAsync() on each of them? The response times should start over at 70ms with every new QueryAgent, even though there are still requests running in the background on the older QueryAgents. – H W Mar 25 '15 at 12:56
  • @HW have a look at the recent edit. HW, I have tried that previously (new HttpClient for multiple QueryAgents) and the same thing occurred (slowdown). – undercurrent Mar 25 '15 at 13:00
  • your edit confused me as well. Could you post the code on how/where exactly you measure the times? – H W Mar 25 '15 at 13:12
  • @HW, OP updated with the timing code – undercurrent Mar 25 '15 at 13:22
  • ConfigureAwait(false) is missing – Guillaume Apr 08 '15 at 14:45
  • @guillaume I tried changing that previously but it made no difference. Is there a particular await that should have its ConfigureAwait set to false? Are all awaits at all call depths to be set as false? The 1 ms delay seems to have fixed all my issues, so I'm accepting that the server isn't able to handle a faster rate for any one connection. – undercurrent Apr 09 '15 at 12:08
  • You should remove `ConfigureAwait(false)` only when the continuation is required to run on the original context (UI thread/HttpContext/...). In all other cases (the vast majority) you should add a `ConfigureAwait(false)`. – Guillaume Apr 09 '15 at 12:32
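
For reference, a minimal sketch of that advice applied to the DownloadData method from the question, assuming no continuation needs the original synchronization context (this is illustrative, not a confirmed fix for the slowdown):

    public async Task<string> DownloadData(string requestString)
    {
        // ConfigureAwait(false) tells each await not to resume on the captured context;
        // in a console app there is no UI context, so this mainly documents intent.
        HttpResponseMessage response = await this.httpClient
            .PostAsync( url, new StringContent( requestString ))
            .ConfigureAwait( false );
        string returnString = await response.Content.ReadAsStringAsync().ConfigureAwait( false );
        return returnString;
    }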

2 Answers


When you await something, the subsequent code executes only after the awaited operation has completed.

To send multiple queries concurrently, you have to create a new Task for each query and then await all of them together.

var tasks = new List<Task>();

foreach (string identifier in Identifiers)
{
    var newRequestString = generateRequestString(identifier);
    var newRequest = RequestNewDataAsync(newRequestString);
    tasks.Add(newRequest);
}

await Task.WhenAll(tasks); // asynchronously wait for all tasks to complete
  • shr, that was my understanding. Have a look at what I wrote in DownloadLoopAsync(). I create a new Task for each new identifier without awaiting. Wouldn't this mean each one would be processing asynchronously? Is what you are saying akin to creating a collection of tasks and then processing them in one go? So, like in https://msdn.microsoft.com/en-us/library/jj155756.aspx, I have to create a list of queries, but instead of using a WhenAny(), just call the .ToList() and let them process on their own? – undercurrent Mar 25 '15 at 12:20
  • Anthony, in your DownloadLoopAsync(), a new task is not created. In fact, since the call to RequestNewDataAsync(newRequestString); is not awaited, it will be a synchronous call. Yes, you are right, to send concurrent requests, you have to create a collection of tasks. You can await the collection of tasks together using WhenAll(). – shr Mar 25 '15 at 12:58
  • @Anthony, would you please post the complete code including how you are measuring the times. – shr Mar 25 '15 at 13:21
  • Based on your observations, it looks like the delay is from your server. When you put the delay within the loop, the requests to the server are spaced out. Delays after posting the request have no impact on the timings. To confirm this, would you please record the absolute timing of when the post request is done? You can use another Stopwatch that is created in the beginning. If this hypothesis is correct, then you will notice that when the requests all go together (with some execution delays), the server response time increases for subsequent requests. – shr Mar 26 '15 at 04:42
  • Interesting, this may then be what is happening! How do you mean "absolute timing of when the post request is done"; how does this differ to what I've already posted. Are you saying to time the entire foreach loop without any delay in it? – undercurrent Mar 26 '15 at 10:33
  • By absolute I meant time from the beginning of execution. Right now, you are measuring the time required to execute the post request call. It would be good to get a 'time stamp' of when the post request call is made. Hence, for example if you start your stopwatch at the beginning, print out the time for every post request that is made. You may notice that the first post was made at say 200th millisecond and the second one was issued at 205th millisecond, the third one was at 210th and so on. Theoretically with no execution delays, all the 50 requests should have gone at the 200th millisecond. – shr Mar 27 '15 at 05:22
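
A minimal sketch of that suggestion, assuming one Stopwatch shared by the Downloader and started once at program start (the globalClock field name is illustrative):

    // Hypothetical: log the absolute time each request is issued as well as its round trip.
    private static readonly Stopwatch globalClock = Stopwatch.StartNew();

    public async Task<string> DownloadData(string requestString)
    {
        double sentAt = globalClock.Elapsed.TotalMilliseconds; // when PostAsync is issued
        Stopwatch s = Stopwatch.StartNew();
        HttpResponseMessage response = await this.httpClient.PostAsync( url, new StringContent( requestString ));
        Console.WriteLine( "Sent at: {0} ms, round trip: {1} ms", sentAt, s.Elapsed.TotalMilliseconds );
        return await response.Content.ReadAsStringAsync();
    }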

Most likely you are just saturating your network or the remote server with requests. Your infinite loop is not waiting for the previous batch to complete before starting the next batch, so over time you are issuing batches faster than they can complete, causing a backlog.

Here's a modified loop that waits for a batch to complete before starting the next batch. Notice it no longer awaits a Task.Delay and instead awaits the collection of download tasks.

public async Task DownloadLoopAsync() 
{
    while ( this.isLooping )
    {
      var tasks = this
        .Identifiers
        .Select(id => RequestNewDataAsync(this.generateRequestString(id)));

      await Task.WhenAll(tasks);
    }
}

Note you have a race condition in how you store the results of the processed data. This may or may not be an issue for you:

public async Task RequestNewDataAsync(string requestString)
{
    string responseString = await this.downloader.DownloadData(requestString);

    // Note: you have a race condition here.
    // multiple downloads could complete simultaneously and try to concurrently
    // assign to ProcessedData.  This may or may not be an issue for you.
    this.ProcessedData = this.ProcessData(responseString);
}
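
One possible way around that race, sketched here on the assumption that you want one result per identifier (the identifier parameter, the dictionary field, and the string value type are illustrative, standing in for whatever ProcessData actually returns):

    // Requires using System.Collections.Concurrent;
    // Hypothetical: keep one entry per identifier instead of overwriting a single property.
    private readonly ConcurrentDictionary<string, string> processedData =
        new ConcurrentDictionary<string, string>();

    public async Task RequestNewDataAsync(string identifier, string requestString)
    {
        string responseString = await this.downloader.DownloadData(requestString);
        this.processedData[identifier] = this.ProcessData(responseString); // per-key writes are thread-safe
    }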

Also I modified your main program to await the final batch to complete after the user presses a key:

QueryAgent myAgent = new QueryAgent(httpClient);
Task dataLoop = myAgent.DownloadLoopAsync();   
Console.ReadLine();
myAgent.isLooping = false;
dataLoop.Wait(); // let the last set of downloads complete
  • The original code uses concurrent dictionaries; I've just simplified it to post here. Using Task.WhenAll makes sense, however doesn't this require a batch to be responded to before any new requests can be sent out? If I couple transmission with reception then everything is now dependent on the latency between my request, the server responding, sending it back to me, and me downloading it. If I want a 10 Hz data rate, and know the server can handle this speed of requests, but my ping only allows 3 Hz, how would one accomplish this? Perhaps it's not the server end, but my end that's limited. – undercurrent Mar 26 '15 at 00:41