I'm using Polly to handle scenarios such as throttled requests and timeouts. The policies are added directly in Startup.cs, like this:

// Example values: 25 retries with a 10 s delay each, 250 s in total.
var retries = Enumerable.Repeat(TimeSpan.FromSeconds(10), 25).ToList();

serviceCollection
    .AddHttpClient<IApplicationApi, ApplicationApi>()
    .AddPolicyHandler((services, request) => GetRetryPolicy<ApplicationApi>(retries, services));

The Policy:

static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy<T>(List<TimeSpan> retries, IServiceProvider services)
{
    return HttpPolicyExtensions
        .HandleTransientHttpError()
        .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
        .WaitAndRetryAsync(retries,
            onRetry: (outcome, timespan, retryAttempt, context) =>
            {
                //do some logging
            });
}

In ApplicationApi.cs do something like this:

private readonly HttpClient _httpClient;
public ApplicationApi(HttpClient httpClient)
{
    _httpClient = httpClient;
}
       
public async Task CallApi()
{
    var url = "https://whateverurl.com/someapi";
    using (var request = new HttpRequestMessage(HttpMethod.Get, url))
    {
        var response = await _httpClient.SendAsync(request);
        var respMessage = await response.Content.ReadAsStringAsync();
    }
}

Now let's say I don't specify HttpClient.Timeout, so the default timeout of 100s is used.

My problem is with heavy throttling. Polly retries until the throttling is resolved or the maximum number of retries is reached. But the program throws an exception on the 10th retry, because by then more than 100s have elapsed on the HttpClient since the first request was throttled.

It seems as if the first HTTP request that got throttled is still open and was never closed, though I may be wrong. What is causing this? Is it normal behaviour for Polly retries? How can I make it close the connection on each retry, so that I don't have to set a very high HttpClient.Timeout value?

I also implemented the Polly timeout policy to cut off a request if it takes longer than a specified number of seconds and then retry until it succeeds. But the behaviour is the same, so I still need to set the HttpClient timeout higher than the total time elapsed across the retries.

UPDATE: Code updated. I just realized there is a using statement for the request.

UPDATE 2: I've created a repo that reproduces the behaviour here: https://github.com/purnadika/PollyTestWebApi

purnadika
  • I could not reproduce your problem. [Here is a working example](https://dotnetfiddle.net/FVjaik) (4 retries with 1 second response delay and 2 seconds Timeout on the HttpClient) – Peter Csala Jul 15 '22 at 08:33
  • By the way what exception are you receiving? `TaskCancelledException`? – Peter Csala Jul 15 '22 at 08:34
  • @PeterCsala Yes, I tested it in a console app, but the behaviour is different. My app uses .NET Core. Let me try to create a dummy app; I hope it will reflect the situation. Yes, the error is System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled. and System.AggregateException: One or more errors occurred. (A task was canceled.) ---> System.Threading.Tasks.TaskCanceledException: A task was canceled. --- End of inner exception stack – purnadika Jul 18 '22 at 05:05
  • Which dotNet core version are you using? – Peter Csala Jul 18 '22 at 06:29
  • @PeterCsala netcore 3.1 btw you can try it in my repo here https://github.com/purnadika/PollyTestWebApi – purnadika Jul 18 '22 at 06:39
  • I've checked it with .NET Core 3.1 and it works the same as it did with .NET 6. But if I change the HttpClient's Timeout to 1 second and the response delay to 500 milliseconds, then with .NET 6 it works fine, while with .NET Core 3.1 it fails with the `TaskCancelledException`. Later today I'll check your repo – Peter Csala Jul 18 '22 at 08:14
  • I've played with your code a little bit. I could reproduce the observed behaviour. I think I have a primary suspect. Tomorrow I'll continue my investigation. – Peter Csala Jul 18 '22 at 19:31

2 Answers

The short answer is that the behaviour you observed is due to how AddPolicyHandler and PolicyHttpMessageHandler work.

Whenever you register a new typed HttpClient without any policy (.AddHttpClient), you basically create a new HttpClient like this:

var handler = new HttpClientHandler();
var client = new HttpClient(handler);

Of course it is much more complicated, but from our topic perspective it works like that.

If you register a new typed HttpClient with a policy (.AddHttpClient().AddPolicyHandler()), then you create a new HttpClient like this:

var handler = new PolicyHttpMessageHandler(yourPolicy);
handler.InnerHandler = new HttpClientHandler();
var client = new HttpClient(handler);

So the outer handler is the PolicyHttpMessageHandler and the inner one is the default HttpClientHandler.

PolicyHttpMessageHandler has the following documentation comment:

/// <para>
/// Take care when using policies such as Retry or Timeout together as HttpClient provides its own timeout via
/// <see cref="HttpClient.Timeout"/>.  When combining Retry and Timeout, <see cref="HttpClient.Timeout"/> will act as a
/// timeout across all tries; a Polly Timeout policy can be configured after a Retry policy in the configuration sequence,
/// to provide a timeout-per-try.
/// </para>

So when you use AddPolicyHandler, the HttpClient's Timeout acts as a global timeout across all retry attempts.
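The arrangement described in that documentation comment can be sketched roughly as follows. This is a minimal illustration under assumptions, not the asker's actual configuration: the retry count, the delays, the 30-second per-try limit and the 5-minute overall budget are made-up values, and `AddApplicationApi` is a hypothetical extension method name. `IApplicationApi` and `ApplicationApi` are the types from the question. Note that HttpClient.Timeout still bounds the whole sequence here, so it has to be at least as large as the total retry budget.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;
using Polly.Timeout;

public static class HttpClientRegistration
{
    // Hypothetical helper; IApplicationApi / ApplicationApi come from the question.
    public static IServiceCollection AddApplicationApi(this IServiceCollection services)
    {
        // Outer policy: retry transient errors and per-try timeouts.
        // Illustrative values: 5 retries, 10 s apart.
        var retry = HttpPolicyExtensions
            .HandleTransientHttpError()
            .Or<TimeoutRejectedException>() // thrown by the Polly timeout policy below
            .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(10));

        // Inner policy: limits each individual try (assumed 30 s).
        var perTryTimeout = Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(30));

        services
            .AddHttpClient<IApplicationApi, ApplicationApi>(client =>
                // Overall budget across all tries (assumed 5 minutes).
                client.Timeout = TimeSpan.FromMinutes(5))
            .AddPolicyHandler(retry)          // registered first, so it is the outer handler
            .AddPolicyHandler(perTryTimeout); // registered after retry, so it applies per try

        return services;
    }
}
```

With this ordering the Polly timeout gives a timeout per try, while HttpClient.Timeout remains the limit for the complete call including all waits between retries.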


The solution

There is a workaround: avoid using AddPolicyHandler.

So, rather than decorating your typed client at registration time, you can decorate only the specific HttpClient method call inside your typed client.

Here is a simplified example based on your dummy project:

  • ConfigureServices
services.AddHttpClient<IApplicationApi, ApplicationApi>(client => client.Timeout = TimeSpan.FromSeconds(whateverLowValue));
  • _MainRequest
var response = await GetRetryPolicy().ExecuteAsync(async () => await _httpClient.GetAsync(url));

Here I would like to emphasize that you should prefer GetAsync over SendAsync, since an HttpRequestMessage cannot be reused.

So, if you wrote the above code like this

using (var request = new HttpRequestMessage(HttpMethod.Get, url))
{
   var response = await GetRetryPolicy().ExecuteAsync(async () => await _httpClient.SendAsync(request));
}

then you would receive the following exception:

InvalidOperationException: The request message was already sent. Cannot send the same request message multiple times.

So, with this workaround the HttpClient's Timeout will not act as a global / overarching timeout over the retry attempts.
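If SendAsync is needed (for example to set headers on the request), one way around the exception above is to create a fresh HttpRequestMessage inside the delegate, so that every attempt sends its own instance. The following is a minimal sketch under assumptions: the `GetRetryPolicy` shown here (3 retries, 2 seconds apart) and the `CallApiAsync` method name are illustrative, not taken from the dummy project. The `Or<TaskCanceledException>()` clause is also an assumption on my part: a per-attempt HttpClient.Timeout surfaces as a TaskCanceledException, which HandleTransientHttpError does not cover by itself.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.Extensions.Http;

public class ApplicationApi
{
    private readonly HttpClient _httpClient;

    public ApplicationApi(HttpClient httpClient) => _httpClient = httpClient;

    // Illustrative policy: 3 retries, 2 seconds apart (assumed values).
    private static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy() =>
        HttpPolicyExtensions
            .HandleTransientHttpError()
            .Or<TaskCanceledException>() // assumption: also retry when a single attempt times out
            .OrResult(msg => msg.StatusCode == HttpStatusCode.TooManyRequests)
            .WaitAndRetryAsync(3, _ => TimeSpan.FromSeconds(2));

    public async Task<string> CallApiAsync(string url)
    {
        using var response = await GetRetryPolicy().ExecuteAsync(async () =>
        {
            // A new request message per attempt avoids
            // "The request message was already sent".
            using var request = new HttpRequestMessage(HttpMethod.Get, url);
            return await _httpClient.SendAsync(request);
        });

        return await response.Content.ReadAsStringAsync();
    }
}
```

Because the policy wraps each call to the HttpClient rather than sitting inside its handler pipeline, HttpClient.Timeout applies to each attempt separately.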

Peter Csala

Polly works to retry the top-level HttpClient request, so HttpClient's timeout applies to all retries. That's the whole point of using Polly, to retry requests in a way that's transparent to the top-level code.

If retrying for over a minute failed to work, the retry policy isn't good enough. Retrying over and over with a fixed delay will only result in more 429 responses, as all the requests that failed will be retried at the same time. This will result in wave after wave of identical requests hitting the server, resulting in 429s once again.

To avoid this, exponential backoff and jitter are used to introduce an increasing random delay to retries.

From the linked sample:

var delay = Backoff.DecorrelatedJitterBackoffV2(
        medianFirstRetryDelay: TimeSpan.FromSeconds(1), 
        retryCount: 5);

var retryPolicy = Policy
    .Handle<FooException>()
    .WaitAndRetryAsync(delay);

The Polly page for the jitter strategy explains how this works. The distribution graph of delays shows that even with 5 retries, the retry intervals don't clump together.

This means there's less chance of multiple HttpClient calls retrying at the same time and causing renewed throttling.

[Figure: Polly jitter retry distribution, which is roughly exponential]

Panagiotis Kanavos
  • Yes, I'm aware of the problem with a fixed delay. Actually, I have already implemented an increasing delay: 10s, 20s, 30s, ... ns. It still doesn't solve the throttling on the last request. We will try to "honor" the Retry-After value given, which will be dynamic. If the Polly retries still like this, then I need to set HttpClient.Timeout higher, or even infinite? – purnadika Jul 15 '22 at 13:53
  • @purnadika using a fixed exponential delay isn't enough because all requests that failed close together will retry at the same time, resulting in another wave of timeouts. That's why jitter is needed and why `Backoff.DecorrelatedJitterBackoffV2` was added and a lot of experimentation went into ensuring retries are not clumped together – Panagiotis Kanavos Jul 15 '22 at 14:01
  • @purnadika `If the Polly retries still like this, then I need to set HttpClient.Timeout higher, or even infinite?` first add jitter or better yet use `DecorrelatedJitterBackoffV2`. Then you have to investigate why there's so much throttling. Increasing the timeout won't solve the problem, it will only make it last longer. If you know the allowed rate, consider adding a [Rate Limiting](https://github.com/App-vNext/Polly#rate-limit) policy to *avoid* throttling in the first place – Panagiotis Kanavos Jul 15 '22 at 14:07
  • @purnadika another option would be to use the [Retry-After](https://github.com/App-vNext/Polly/issues/414) header returned with 429 responses, if it exists. You should still include jitter though, to avoid everything retrying at the same time – Panagiotis Kanavos Jul 15 '22 at 14:09
  • @purnadika in general, the best way to ensure high throughput is to ensure you don't end up in a throttling situation. That's why Rate Limiting should be tried first. There's also a proposal for [RateLimit response headers](https://tools.ietf.org/id/draft-polli-ratelimit-headers-00.html) which [some services like Github](https://stackoverflow.com/questions/16022624/examples-of-http-api-rate-limiting-http-response-headers) already implement. You can use these to limit your calls proactively. – Panagiotis Kanavos Jul 15 '22 at 14:11
  • `so HttpClient's timeout applies to all retries` Does that mean what I've described above? The timeout starts from the very first request that got throttled and keeps counting until it reaches the limit or the max retry, doesn't it? Yes, I'm planning to update and enhance the current code. Thanks for your suggestions. I still need an answer to my current problem with the HTTP timeout, and I'm also waiting for other responses from the experts here – purnadika Jul 18 '22 at 06:55