We are using Azure Table Storage and are getting occasional 408 Timeouts when performing an InsertOrMerge operation. In this case we would like to retry, but it appears that the retry policy is not being followed for these errors.
This is a class we use to handle the table interaction. The method GetFooEntityAsync tries to retrieve the entity from Table Storage. If it cannot, it creates a new FooEntity and adds it to the table (mapping to a FooTableEntity).
    public class FooTableStorageBase
    {
        private readonly string tableName;
        protected readonly CloudStorageAccount storageAccount;
        protected TableRequestOptions DefaultTableRequestOptions { get; }
        protected OperationContext DefaultOperationContext { get; }

        public CloudTable Table
        {
            get
            {
                return storageAccount.CreateCloudTableClient().GetTableReference(tableName);
            }
        }

        public FooTableStorageBase(string tableName)
        {
            if (String.IsNullOrWhiteSpace(tableName))
            {
                throw new ArgumentNullException(nameof(tableName));
            }

            this.tableName = tableName;
            storageAccount = CloudStorageAccount.Parse(ConnectionString);

            ServicePoint tableServicePoint = ServicePointManager.FindServicePoint(storageAccount.TableEndpoint);
            tableServicePoint.UseNagleAlgorithm = false;
            tableServicePoint.ConnectionLimit = 100; // Increasing connection limit from default of 2.

            DefaultTableRequestOptions = new TableRequestOptions()
            {
                PayloadFormat = TablePayloadFormat.JsonNoMetadata,
                MaximumExecutionTime = TimeSpan.FromSeconds(1),
                RetryPolicy = new OnTimeoutRetry(TimeSpan.FromMilliseconds(250), 3),
                LocationMode = LocationMode.PrimaryOnly
            };

            DefaultOperationContext = new OperationContext();
            DefaultOperationContext.Retrying += (sender, args) =>
            {
                // This is never executed.
                Debug.WriteLine($"Retry policy activated in {this.GetType().Name} due to HTTP code {args.RequestInformation.HttpStatusCode} with exception {args.RequestInformation.Exception.ToString()}");
            };
            DefaultOperationContext.RequestCompleted += (sender, args) =>
            {
                if (args.Response == null)
                {
                    // This is occasionally executed - we want to retry in this case.
                    Debug.WriteLine($"Request failed in {this.GetType().Name} due to HTTP code {args.RequestInformation.HttpStatusCode} with exception {args.RequestInformation.Exception.ToString()}");
                }
                else
                {
                    Debug.WriteLine($"{this.GetType().Name} operation complete: Status code {args.Response.StatusCode} at {args.Response.ResponseUri}");
                }
            };

            Table.CreateIfNotExists(DefaultTableRequestOptions, DefaultOperationContext);
        }

        public async Task<FooEntity> GetFooEntityAsync()
        {
            var retrieveOperation = TableOperation.Retrieve<FooTableEntity>(FooTableEntity.GenerateKey());
            var tableEntity = (await Table.ExecuteAsync(retrieveOperation, DefaultTableRequestOptions, DefaultOperationContext)).Result as FooTableEntity;
            if (tableEntity != null)
            {
                return tableEntity.ToFooEntity();
            }

            var fooEntity = CalculateFooEntity();
            var insertOperation = TableOperation.InsertOrMerge(new FooTableEntity(fooEntity));
            var executeResult = await Table.ExecuteAsync(insertOperation);
            if (executeResult.HttpStatusCode == 408)
            {
                // This is never executed.
                Debug.WriteLine("Got a 408");
            }
            return fooEntity;
        }

        public class OnTimeoutRetry : IRetryPolicy
        {
            private readonly int maxRetryAttempts;
            private readonly TimeSpan defaultRetryInterval;

            public OnTimeoutRetry(TimeSpan deltaBackoff, int retryAttempts)
            {
                maxRetryAttempts = retryAttempts;
                defaultRetryInterval = deltaBackoff;
            }

            public IRetryPolicy CreateInstance()
            {
                // Copy this instance's settings rather than hard-coding them.
                return new OnTimeoutRetry(defaultRetryInterval, maxRetryAttempts);
            }

            public bool ShouldRetry(int currentRetryCount, int statusCode, Exception lastException, out TimeSpan retryInterval, OperationContext operationContext)
            {
                retryInterval = defaultRetryInterval;
                if (currentRetryCount >= maxRetryAttempts)
                {
                    return false;
                }

                // Non-retryable errors are the 4xx class (>= 400 and < 500: Bad Request, Not Found, etc.) as well as 501 and 505.
                // This custom retry policy makes an exception for 408 (timeout), which is retried.
                if ((statusCode >= 400 && statusCode < 500 && statusCode != 408) || statusCode == 501 || statusCode == 505)
                {
                    return false;
                }
                return true;
            }
        }
    }
When calling GetFooEntityAsync(), the "Request failed" line is occasionally executed. Inspecting the values at that point shows args.RequestInformation.HttpStatusCode = 408. However:
- Debug.WriteLine("Got a 408"); within the GetFooEntityAsync method is never executed.
- Debug.WriteLine($"Retry policy activated... within the DefaultOperationContext.Retrying delegate is never executed (I would expect this to be executed twice - is this not retrying?).
- DefaultOperationContext.RequestResults contains a long list of results (mostly with status codes 404, some 204s).
According to this (rather old) blog post, errors with status codes between 400 and 500, as well as 501 and 505, are non-retryable. However, a timeout (408) is exactly the situation in which we would want a retry. Perhaps I need to write a custom retry policy for this case.
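To make the rule I am after concrete, here is a minimal, dependency-free sketch of the status-code classification: the "non-retryable" rule as I read it from the blog post, plus the 408 exception I want. The class and method names (RetryClassification, IsRetryableStatusCode) are hypothetical and not part of the storage SDK. This only models the decision, not how the client library actually invokes a retry policy.

```csharp
using System;

public static class RetryClassification
{
    // Hypothetical helper (not an SDK API). Only meaningful for failed requests:
    // it says nothing about 2xx/3xx responses.
    public static bool IsRetryableStatusCode(int statusCode)
    {
        // Desired exception to the 4xx rule: a timeout should be retried.
        if (statusCode == 408)
        {
            return true;
        }

        // 4xx client errors (Bad Request, Not Found, Conflict, ...) are not retried.
        if (statusCode >= 400 && statusCode < 500)
        {
            return false;
        }

        // 501 (Not Implemented) and 505 (HTTP Version Not Supported) are not retried.
        if (statusCode == 501 || statusCode == 505)
        {
            return false;
        }

        // Everything else (e.g. 500, 503) is treated as transient.
        return true;
    }

    public static void Main()
    {
        foreach (var code in new[] { 404, 408, 500, 501, 503, 505 })
        {
            var verdict = IsRetryableStatusCode(code) ? "retry" : "do not retry";
            Console.WriteLine($"{code}: {verdict}");
        }
    }
}
```

This is the same condition my ShouldRetry implementation is meant to express, written as separate checks so that the 408 case is easy to see.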
I don't fully understand where the 408 is coming from, as I can't find it in the code other than when the RequestCompleted delegate is invoked. I have been trying different settings for my retry policy without luck. What am I missing here? How can I get the operation to retry on a 408 from table storage?
EDIT: I have updated the code to show the custom retry policy that I implemented, to retry on 408 errors. However, it seems that my breakpoints on retry are still not being hit, so it appears the retry is not being triggered. What could be the reason my retry policy is not being activated?