9

Why does the following asynchronous recursion fail with StackOverflowException, and why is it happening exactly at the last step, when the counter becomes zero?

static async Task<int> TestAsync(int c)
{
    if (c < 0)
        return c;

    Console.WriteLine(new { c, where = "before", Environment.CurrentManagedThreadId });

    await Task.Yield();

    Console.WriteLine(new { c, where = "after", Environment.CurrentManagedThreadId });

    return await TestAsync(c-1);
}

static void Main(string[] args)
{
    Task.Run(() => TestAsync(5000)).GetAwaiter().GetResult();
}

Output:

...
{ c = 10, where = before, CurrentManagedThreadId = 4 }
{ c = 10, where = after, CurrentManagedThreadId = 4 }
{ c = 9, where = before, CurrentManagedThreadId = 4 }
{ c = 9, where = after, CurrentManagedThreadId = 5 }
{ c = 8, where = before, CurrentManagedThreadId = 5 }
{ c = 8, where = after, CurrentManagedThreadId = 5 }
{ c = 7, where = before, CurrentManagedThreadId = 5 }
{ c = 7, where = after, CurrentManagedThreadId = 5 }
{ c = 6, where = before, CurrentManagedThreadId = 5 }
{ c = 6, where = after, CurrentManagedThreadId = 5 }
{ c = 5, where = before, CurrentManagedThreadId = 5 }
{ c = 5, where = after, CurrentManagedThreadId = 5 }
{ c = 4, where = before, CurrentManagedThreadId = 5 }
{ c = 4, where = after, CurrentManagedThreadId = 5 }
{ c = 3, where = before, CurrentManagedThreadId = 5 }
{ c = 3, where = after, CurrentManagedThreadId = 5 }
{ c = 2, where = before, CurrentManagedThreadId = 5 }
{ c = 2, where = after, CurrentManagedThreadId = 5 }
{ c = 1, where = before, CurrentManagedThreadId = 5 }
{ c = 1, where = after, CurrentManagedThreadId = 5 }
{ c = 0, where = before, CurrentManagedThreadId = 5 }
{ c = 0, where = after, CurrentManagedThreadId = 5 }

Process is terminated due to StackOverflowException.

I'm seeing this with .NET 4.6 installed. The project is a console app targeting .NET 4.5.

I understand that the continuation for Task.Yield may get scheduled by ThreadPool.QueueUserWorkItem on the same thread (like #5 above), in case the thread has been already released to the pool - right after await Task.Yield(), but before the QueueUserWorkItem callback has been actually scheduled.

I don't however understand why and where the stack is still deepening. The continuation shouldn't be happening on the same stack frame here, even if it's called on the same thread.

I took a step further and implemented a custom version of Yield which makes sure the continuation doesn't happen on the same thread:

public static class TaskExt
{
    public static YieldAwaiter Yield() { return new YieldAwaiter(); }

    public struct YieldAwaiter : System.Runtime.CompilerServices.ICriticalNotifyCompletion
    {
        public YieldAwaiter GetAwaiter() { return this; }

        public bool IsCompleted { get { return false; } }

        public void GetResult() { }

        public void UnsafeOnCompleted(Action continuation)
        {
            using (var mre = new ManualResetEvent(initialState: false))
            {
                ThreadPool.UnsafeQueueUserWorkItem(_ => 
                {
                    mre.Set();
                    continuation();
                }, null);

                mre.WaitOne();
            }
        }

        public void OnCompleted(Action continuation)
        {
            throw new NotImplementedException();
        }
    }
}

Now, while using TaskExt.Yield instead of Task.Yield, threads are flipping each time but the stack overflow is still there:

...
{ c = 10, where = before, CurrentManagedThreadId = 3 }
{ c = 10, where = after, CurrentManagedThreadId = 4 }
{ c = 9, where = before, CurrentManagedThreadId = 4 }
{ c = 9, where = after, CurrentManagedThreadId = 5 }
{ c = 8, where = before, CurrentManagedThreadId = 5 }
{ c = 8, where = after, CurrentManagedThreadId = 3 }
{ c = 7, where = before, CurrentManagedThreadId = 3 }
{ c = 7, where = after, CurrentManagedThreadId = 4 }
{ c = 6, where = before, CurrentManagedThreadId = 4 }
{ c = 6, where = after, CurrentManagedThreadId = 5 }
{ c = 5, where = before, CurrentManagedThreadId = 5 }
{ c = 5, where = after, CurrentManagedThreadId = 4 }
{ c = 4, where = before, CurrentManagedThreadId = 4 }
{ c = 4, where = after, CurrentManagedThreadId = 3 }
{ c = 3, where = before, CurrentManagedThreadId = 3 }
{ c = 3, where = after, CurrentManagedThreadId = 5 }
{ c = 2, where = before, CurrentManagedThreadId = 5 }
{ c = 2, where = after, CurrentManagedThreadId = 3 }
{ c = 1, where = before, CurrentManagedThreadId = 3 }
{ c = 1, where = after, CurrentManagedThreadId = 5 }
{ c = 0, where = before, CurrentManagedThreadId = 5 }
{ c = 0, where = after, CurrentManagedThreadId = 3 }

Process is terminated due to StackOverflowException.
noseratio
  • 59,932
  • 34
  • 208
  • 486

1 Answers1

8

TPL reentrancy strikes again:

Note, that the stack overflow happens at the end of the function after completion of all iterations. Increasing the iteration count does not change that. Lowering it to a small amount removes the stack overflow.

The stack overflow happens when completing the async state machine task of the method TestAsync. It does not happen on the "descend". It happens when backing out and completing all async method tasks.

Let's first reduce the count to 2000 to put less load on the debugger. Then, look at the call stack:

enter image description here

Certainly very repetitive and long. This is the right thread to look at. The crash happens at:

        var t = await TestAsync(c - 1);
        return t;

When the inner task t completes it causes execution of the rest of the outer TestAsync. This is just the return statement. The return completes the task that the outer TestAsync has produced. This again triggers completion of another t and so on.

The TPL inlines some task continuations as a performance optimization. This behavior has caused a lot of grief already as evidenced by Stack Overflow questions. It has been requested to remove it. The issue is quite old and has not received any response so far. This does not inspire hope that we might eventually get rid of TPL reentrancy problems.

The TPL has some stack depth checks to turn off inlining of continuations when the stack becomes too deep. This is not being done here for reasons (yet) unknown to me. Note, that nowhere on the stack there is a TaskCompletionSource. TaskAwaiter makes use of internal functions in the TPL in order to increase performance. Maybe that optimized code path does not perform stack depth checks. Maybe this is a bug in that sense.

I don't think calling Yield has anything to do with the problem but it's good to put it in here to ensure non-synchronous completion of TestAsync.


Let's write the async state machine manually:

static Task<int> TestAsync(int c)
{
    var tcs = new TaskCompletionSource<int>();

    if (c < 0)
        tcs.SetResult(0);
    else
    {
        Task.Run(() =>
        {
            var t = TestAsync(c - 1);
            t.ContinueWith(_ => tcs.SetResult(0), TaskContinuationOptions.ExecuteSynchronously);
        });
    }

    return tcs.Task;
}

static void Main(string[] args)
{
    Task.Run(() => TestAsync(2000).ContinueWith(_ =>
    {
          //breakpoint here - look at the stack
    }, TaskContinuationOptions.ExecuteSynchronously)).GetAwaiter().GetResult();
}

Thanks to TaskContinuationOptions.ExecuteSynchronously we also expect continuation inlining to happen. It does, but it does not overflow the stack:

enter image description here

That's because the TPL prevents the stack from becoming too deep (as explained above). This mechanism seems to not be present when completing an async method task.

If ExecuteSynchronously is removed then the stack is shallow and no inlining happens. await runs with ExecuteSynchronously enabled.

Community
  • 1
  • 1
usr
  • 168,620
  • 35
  • 240
  • 369
  • An excellent answer. In fact the question was inspired by a customer awaiter (let's call it `AlwaysAsync`) which I experiment with to solve the exact same TPL problem you've described here. I use it inside `TestAsync` but not on its return line. So I've just changed the return line to be `return await TestAsync(c-1).AlwaysAsync()` and the problem has gone :) – noseratio Sep 05 '15 at 12:41
  • One other way of eliminating the stack dive here is using `Task.Run`: `async Task TestAsync(int c) { if (c < 0) return c; return await Task.Run(() => TestAsync(c - 1)); }`. – noseratio Sep 05 '15 at 22:14
  • @Noseratio does that work in practice? I think the completion might be inlined all the way through. Task.Run has special perf-optimized unwrapping code inside. Maybe here the stack overflow avoidance mechanism is in place and working. – usr Sep 05 '15 at 22:17
  • 1
    Actually, it seems this might be on its way to being fixed, see [the issue *Consider adding stack guard support to all Task continuation* on CoreCLR's GitHub](https://github.com/dotnet/coreclr/issues/1191). – svick Sep 05 '15 at 22:35
  • @usr, `Task.Run` does work regardless of the number of iterations. I myself don't know exactly why yet. Perhaps you're right about the `Task.Unwrap` which might include that check which avoids inlining. – noseratio Sep 05 '15 at 23:02