
I am using a third-party library that is non-async but can also either take longer than desired or occasionally completely block indefinitely (until cleared externally).

Represented for testing with this:

SlowWorkerResult SlowWorker(int i)
{
    var delay = i % 2 == 0 ? TimeSpan.FromSeconds(2) : TimeSpan.FromSeconds(4);
    Thread.Sleep(delay);
    return new SlowWorkerResult();
}

class SlowWorkerResult 
{
    
}

To handle these timeouts I wrap the call in a Task.Run and apply an extension method I wrote to it:

static class Extensions
{
    public static async Task<T> TimeoutAfter<T>(this Task<T> task, TimeSpan timeout)
    {
        var cts = new CancellationTokenSource();
        var delayTask = Task.Delay(timeout);
        var result = await Task.WhenAny(task, delayTask);
        if (result == delayTask)
        {
            throw new TimeoutException();
        }
        cts.Cancel();
        return await task;
    }
}

This works reliably whenever it is run individually, i.e.

async Task<(bool, int)> BigWorker(int i)
{
    try
    {
        Console.WriteLine($"BigWorker Started - {i}");
        
        //do some async work
        await Task.CompletedTask;

        //do some non-async work using the timeout extension
        var slowWorkerResult = await Task.Run(() => SlowWorker(i)).TimeoutAfter(TimeSpan.FromSeconds(3));

        //do some more async work
        await Task.CompletedTask;
        
        return (true, i);
    }
    catch (Exception ex)
    {
        return (false, i);
    }
    finally
    {
        Console.WriteLine($"BigWorker Finished - {i}");
    }
}

I am aware that this essentially abandons a thread. Barring support from the third-party library that isn't coming any time soon (if ever), I have no other way to protect against a deadlock.

However, when I run BigWorker in a parallel loop, I get unexpected results (namely that some tasks time out when I would otherwise expect them to complete). For example, if I set totalWorkers to 10, I get an even split of successes and failures and the process takes about 3 seconds, as expected.

async Task Main()
{
    var sw = new Stopwatch();
    sw.Start();
    const int totalWorkers = 10;
    
    var tasks = new ConcurrentBag<Task<(bool, int)>>();
    Parallel.For(0, totalWorkers, i => tasks.Add(BigWorker(i)));
            
    var results = await Task.WhenAll(tasks);
    sw.Stop();
    
    var success = results.Count(r => r.Item1);
    var fails = results.Count(r => !r.Item1);
    var elapsed = sw.Elapsed.ToString(@"ss\.ffff");

    Console.WriteLine($"Successes: {success}\nFails: {fails}\nElapsed: {elapsed}");
}

Setting totalWorkers to a larger number, say 100, generates an essentially random split of successes and failures, with the total time taking much longer.

I suspect this is due to task scheduling and thread pools; however, I can't figure out what I would need to do to remedy it. Perhaps a custom task scheduler that would somehow make sure my SlowWorker-wrapped task and its Task.Delay start at the same time. Right now it appears that the Task.Delay tasks are occasionally being started/completed before their corresponding SlowWorker-wrapped tasks.

gilliduck
  • What .NET platform are you targeting? .NET Core and later or .NET Framework? – Theodor Zoulias Feb 01 '23 at 15:35
  • 1
    Added tag, targeting .Net Core 3.1 – gilliduck Feb 01 '23 at 15:59
  • These questions might be relevant: [.NET Core equivalent to Thread.Abort](https://stackoverflow.com/questions/53465551/net-core-equivalent-to-thread-abort) and also [Thread or Task (stopping a hung single line of code)](https://stackoverflow.com/questions/75087774/thread-or-task-stopping-a-hung-single-line-of-code). The second is related to the .NET Framework, but the `RunInterruptible` method runs on the .NET Core as well. The `RunAbortable` is not. – Theodor Zoulias Feb 01 '23 at 16:03

1 Answer


I suspect this is due to task scheduling and threadpools

Yes; specifically, the thread pool has a limited injection rate for new threads: once its minimum worker-thread count is exhausted, it adds threads only gradually. If you need to pile on a bunch of blocking synchronous work like this, you should raise that minimum (`ThreadPool.SetMinThreads`), which causes the pool to quickly inject threads up to the new threshold and only switch to the limited injection rate past it.
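A minimal sketch of raising the minimum, run once before starting the workers (the `totalWorkers` value here is assumed to match the constant in your Main):

```csharp
using System;
using System.Threading;

// Raise the thread pool's minimum worker-thread count so that up to
// totalWorkers blocking calls can each get a thread immediately,
// instead of waiting on the pool's throttled thread injection.
const int totalWorkers = 100;
ThreadPool.GetMinThreads(out int workerThreads, out int completionPortThreads);
ThreadPool.SetMinThreads(Math.Max(workerThreads, totalWorkers), completionPortThreads);
```

Note this only changes when the pool *starts* throttling; it doesn't cap the pool, and completion-port threads are left at their existing minimum.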

Alternatively, you could do the timeout from within the Task.Run, but that's pretty complex.
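One way to sketch that "timeout from within" approach, using the question's `SlowWorker`/`SlowWorkerResult` names: run the blocking call on a dedicated background thread and `Join` with the timeout. Because the clock starts only once the `Task.Run` body actually executes, thread pool queuing delays can no longer cause false timeouts:

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs SlowWorker on its own background thread and
// gives up waiting after the timeout. On timeout the abandoned thread
// keeps running (or stays blocked), matching the question's trade-off.
static SlowWorkerResult RunSlowWorkerWithTimeout(int i, TimeSpan timeout)
{
    SlowWorkerResult result = null;
    var thread = new Thread(() => result = SlowWorker(i)) { IsBackground = true };
    thread.Start();
    if (!thread.Join(timeout))        // blocks only the current (pool) thread
        throw new TimeoutException();
    return result;                    // Join establishes visibility of the write
}

// Usage inside BigWorker, replacing the TimeoutAfter call:
// var slowWorkerResult = await Task.Run(() => RunSlowWorkerWithTimeout(i, TimeSpan.FromSeconds(3)));
```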

Side notes:

  • The CancellationTokenSource in TimeoutAfter doesn't do anything; its token is never passed to `task` or `delayTask`, so cancelling it has no effect.
  • TimeoutAfter can be replaced with WaitAsync in .NET 6 and newer.
  • The Parallel.For isn't doing anything useful; it's just parallelizing the adding of tasks to the concurrent collection (and the tiny amount of code at the beginning of BigWorker).
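Putting the last two notes together, a sketch of the simplified code (assuming .NET 6+ for `WaitAsync`; the question targets .NET Core 3.1, where `TimeoutAfter` would stay):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// No Parallel.For needed: calling BigWorker starts each task.
var tasks = Enumerable.Range(0, totalWorkers).Select(BigWorker).ToList();
var results = await Task.WhenAll(tasks);

// And inside BigWorker, WaitAsync replaces the extension method:
// var slowWorkerResult = await Task.Run(() => SlowWorker(i))
//     .WaitAsync(TimeSpan.FromSeconds(3));
```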
Stephen Cleary