33

I have a fragile grasp of how the await keyword works, and I want to extend my understanding of it a bit.

The issue that still makes my head spin is the use of recursion. Here's an example:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TestingAwaitOverflow
{
    class Program
    {
        static void Main(string[] args)
        {
            var task = TestAsync(0);
            System.Threading.Thread.Sleep(100000);
        }

        static async Task TestAsync(int count)
        {
            Console.WriteLine(count);
            await TestAsync(count + 1);
        }
    }
}

This one obviously throws a StackOverflowException.

My understanding is because the code actually runs synchronously until the first asynchronous action, after which it returns a Task object that contains information about the asynchronous operation. In this case, there is no asynchronous operation, thus it just keeps recursing under the false promise that it will eventually get a Task returned.

Now changing it just a tiny bit:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TestingAwaitOverflow
{
    class Program
    {
        static void Main(string[] args)
        {
            var task = TestAsync(0);
            System.Threading.Thread.Sleep(100000);
        }

        static async Task TestAsync(int count)
        {
            await Task.Run(() => Console.WriteLine(count));
            await TestAsync(count + 1);
        }
    }
}

This one does not throw a StackOverflowException. I can sortof see why it works, but I would call it more of a gut feeling (it probably deals with how the code is arranged to use callbacks to avoid building the stack, but I can't translate that gut feeling into an explanation)

So I have two questions:

  • How does the second batch of code avoid a StackOverflowException?
  • Does the second batch of code waste other resources? (for example does it allocate an absurdly large number of Task objects on the heap?)

Thanks!

riwalk
  • 14,033
  • 6
  • 51
  • 68

2 Answers2

18

The part up to the first await in any function runs synchronously. In the first case it runs into a stack overflow because of that - there is nothing interrupting the function calling itself.

The first await (which does not complete immediately - this is the case for you with high likelyhood) causes the function to return (and to give up its stack space!). It queues the rest of it as a continuation. The TPL ensures that continuations never nest too deeply. If there is a risk of stack overflow the continuation is queued to the thread pool, resetting the stack (which was starting to fill up).

The second example can still overflow! What if the Task.Run task always completed immediately? (This is unlikely but possible with the right OS thread scheduling). Then, the async function would never be interrupted (causing it to return and free all stack space) and the same behavior as in case 1 would result.

usr
  • 168,620
  • 35
  • 240
  • 369
  • So if I replaced the `Task.Run()` with a function that immediately returned a completed `Task` object, it would reintroduce the stack overflow exception? (Or did I just propose something that is not possible?) – riwalk Dec 10 '12 at 20:01
  • @Stargazer712 Not only is it possible, it would indeed do what you're describing. Just try `await Task.FromResult(null);` This is why you should avoid a single function calling `await` a lot in a short span of time; you should ensure awaited tasks are either fairly long in operation, or that there is a small finite number of them such that running them synchronously is okay. – Servy Dec 10 '12 at 20:02
  • 5
    @Stargazer712 yes! And if you replaced it with `Task.Yield()` (which is guaranteed to require posting a continuation) you would get guaranteed freedom from stack overflows (at a performance cost). Note, that `Yield` returns an awaitable other than `Task`. It is much harder to get this guarantee from a task because you would have to ensure that it is not completed when queried by the await feature. – usr Dec 10 '12 at 20:04
0

In your first and second example the TestAsync is still waiting for the call to itself to return. The difference is the recursion is printing and returning the thread to other work in the second method. Therefore the recursion isn't fast enough to be a stack overflow. However, the first task is still waiting and eventually count will reach it's max integer size or stack overflow will be thrown again. The point is the calling thread is returned to but the actual async method is scheduled on the same thread. Basically, the TestAsync method is forgotten until await is complete but it is still held in memory. The thread is allowed to do other things until await completes and then that thread is remembered and finishes where await left off. Additional await calls store the thread and forget it again until await is again complete. Until all awaits are completed and the method therefore completes the TaskAsync is still in memory. So, here's the thing. If I tell a method to do something and then call await for a task. The rest of my codes elsewhere continues running. When await is complete the code picks back up there and finishes and then goes back to what it was doing at that time right before it. In your examples your TaskAsync is always in a tombstoned state (so to speak) until the last call complete and returns the calls back up the chain.

EDIT: I kept saying store the thread or that thread and I meant routine. They are all on the same thread which is the main thread in your example. Sorry if I confused you.

Michael
  • 45
  • 2