I am using a third-party library that is synchronous (non-async) and can either take longer than desired or occasionally block indefinitely (until it is cleared externally). For testing, I represent it with this:
SlowWorkerResult SlowWorker(int i)
{
    // Even i blocks for 2 seconds (within the 3-second timeout used below);
    // odd i blocks for 4 seconds (beyond it).
    var delay = i % 2 == 0 ? TimeSpan.FromSeconds(2) : TimeSpan.FromSeconds(4);
    Thread.Sleep(delay);
    return new SlowWorkerResult();
}

class SlowWorkerResult
{
}
To handle these timeouts, I wrap the call in a Task.Run and apply an extension method I wrote:
static class Extensions
{
    public static async Task<T> TimeoutAfter<T>(this Task<T> task, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource();
        // Tie the delay to the token so its timer is cleaned up when the task wins.
        var delayTask = Task.Delay(timeout, cts.Token);
        var result = await Task.WhenAny(task, delayTask);
        if (result == delayTask)
        {
            throw new TimeoutException();
        }
        cts.Cancel();
        return await task;
    }
}
This works reliably when it is run individually, e.g.:
async Task<(bool, int)> BigWorker(int i)
{
    try
    {
        Console.WriteLine($"BigWorker Started - {i}");
        //do some async work
        await Task.CompletedTask;
        //do some non-async work using the timeout extension
        var slowWorkerResult = await Task.Run(() => SlowWorker(i)).TimeoutAfter(TimeSpan.FromSeconds(3));
        //do some more async work
        await Task.CompletedTask;
        return (true, i);
    }
    catch (Exception)
    {
        return (false, i);
    }
    finally
    {
        Console.WriteLine($"BigWorker Finished - {i}");
    }
}
I am aware that this essentially abandons a thread. Barring support from the third-party library, which isn't coming any time soon (if ever), I have no other way to protect against a deadlock.
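To illustrate what I mean by an abandoned thread, here is a minimal sketch (not part of my real code): after the TimeoutException, the Task.Run delegate keeps blocking a thread-pool thread until SlowWorker returns on its own.

var orphan = Task.Run(() => SlowWorker(1)); // odd i => blocks for ~4 seconds
try
{
    await orphan.TimeoutAfter(TimeSpan.FromSeconds(3));
}
catch (TimeoutException)
{
    // The timeout has fired, but SlowWorker is still occupying its
    // thread-pool thread and will do so until its Thread.Sleep ends.
    Console.WriteLine(orphan.IsCompleted); // False at this point
}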
However, when I run BigWorker in a parallel loop, I get unexpected results; namely, some sessions time out when I would otherwise expect them to complete. For example, if I set totalWorkers to 10, I get an even split of successes and failures and the process takes about 3 seconds, as expected.
async Task Main()
{
    var sw = new Stopwatch();
    sw.Start();

    const int totalWorkers = 10;
    var tasks = new ConcurrentBag<Task<(bool, int)>>();
    Parallel.For(0, totalWorkers, i => tasks.Add(BigWorker(i)));
    var results = await Task.WhenAll(tasks);

    sw.Stop();

    var success = results.Count(r => r.Item1);
    var fails = results.Count(r => !r.Item1);
    var elapsed = sw.Elapsed.ToString(@"ss\.ffff");
    Console.WriteLine($"Successes: {success}\nFails: {fails}\nElapsed: {elapsed}");
}
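(As an aside, I don't think Parallel.For is essential to reproducing this; I assume the sketch below is an equivalent way to launch the workers, since BigWorker returns to its caller at its first incomplete await.)

var tasks = Enumerable.Range(0, totalWorkers)
                      .Select(i => BigWorker(i)) // starts each worker; nothing is awaited yet
                      .ToArray();
var results = await Task.WhenAll(tasks);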
Setting totalWorkers to a larger number, say 100, produces an essentially random split of successes and failures, and the total time is much longer.
I suspect this is due to task scheduling and thread pools, but I can't figure out what I would need to do to remedy it, perhaps a custom task scheduler that would somehow ensure my SlowWorker-wrapped task and its Task.Delay start at the same time. Right now it appears that the Task.Delay tasks are occasionally started (and even completed) before their corresponding SlowWorker-wrapped task begins.
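If that suspicion is right, one way I can think of to test it (a sketch based on my assumption, not a verified fix) is to raise the thread pool's minimum worker-thread count before the loop, so the blocking Task.Run delegates don't queue behind one another while the pool ramps up:

// Assumption: the false timeouts come from Task.Run delegates queueing while
// the pool grows. Raising the floor to totalWorkers (the const from Main)
// should let every delegate start promptly, making 100 behave like 10.
ThreadPool.GetMinThreads(out _, out var completionPortThreads);
ThreadPool.SetMinThreads(totalWorkers, completionPortThreads);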