198

Yesterday I was giving a talk about the new C# "async" feature, in particular delving into what the generated code looked like, and the GetAwaiter() / BeginAwait() / EndAwait() calls.

We looked in some detail at the state machine generated by the C# compiler, and there were two aspects we couldn't understand:

  • Why the generated class contains a Dispose() method and a $__disposing variable, which never appear to be used (and the class doesn't implement IDisposable).
  • Why the internal state variable is set to 0 before any call to EndAwait(), when 0 normally appears to mean "this is the initial entry point".

I suspect the first point could be answered by doing something more interesting within the async method, although if anyone has any further information I'd be glad to hear it. This question is more about the second point, however.

Here's a very simple piece of sample code:

using System.Threading.Tasks;

class Test
{
    static async Task<int> Sum(Task<int> t1, Task<int> t2)
    {
        return await t1 + await t2;
    }
}

... and here's the code which gets generated for the MoveNext() method which implements the state machine. This is copied directly from Reflector - I haven't fixed up the unspeakable variable names:

public void MoveNext()
{
    try
    {
        this.$__doFinallyBodies = true;
        switch (this.<>1__state)
        {
            case 1:
                break;

            case 2:
                goto Label_00DA;

            case -1:
                return;

            default:
                this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
                this.<>1__state = 1;
                this.$__doFinallyBodies = false;
                if (this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate))
                {
                    return;
                }
                this.$__doFinallyBodies = true;
                break;
        }
        this.<>1__state = 0;
        this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
        this.<a2>t__$await4 = this.t2.GetAwaiter<int>();
        this.<>1__state = 2;
        this.$__doFinallyBodies = false;
        if (this.<a2>t__$await4.BeginAwait(this.MoveNextDelegate))
        {
            return;
        }
        this.$__doFinallyBodies = true;
    Label_00DA:
        this.<>1__state = 0;
        this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();
        this.<>1__state = -1;
        this.$builder.SetResult(this.<1>t__$await1 + this.<2>t__$await3);
    }
    catch (Exception exception)
    {
        this.<>1__state = -1;
        this.$builder.SetException(exception);
    }
}

It's long, but the important lines for this question are these:

// End of awaiting t1
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

// End of awaiting t2
this.<>1__state = 0;
this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();

In both cases the state is changed again afterwards before it's next obviously observed... so why set it to 0 at all? If MoveNext() were called again at this point (either directly or via Dispose) it would effectively start the async method again, which would be wholly inappropriate as far as I can tell... if and MoveNext() isn't called, the change in state is irrelevant.

Is this simply a side-effect of the compiler reusing iterator block generation code for async, where it may have a more obvious explanation?

Important disclaimer

Obviously this is just a CTP compiler. I fully expect things to change before the final release - and possibly even before the next CTP release. This question is in no way trying to claim this is a flaw in the C# compiler or anything like that. I'm just trying to work out whether there's a subtle reason for this that I've missed :)

Doctor Jones
  • 21,196
  • 13
  • 77
  • 99
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 7
    The VB compiler produces a similar state machine (don't know if that's expected or not, but VB didn't have iterator blocks before) – Damien_The_Unbeliever Feb 17 '11 at 11:24
  • @Damien: Thanks, that's interesting to hear. I have no idea to what extent the two compilers share code. – Jon Skeet Feb 17 '11 at 11:26
  • @Jon - actually, I just checked again. They're similar, but not identical, so I don't think they're sharing code. – Damien_The_Unbeliever Feb 17 '11 at 11:27
  • What's the inner workings of MoveNextDelegate? – Rune FS Feb 17 '11 at 14:52
  • 1
    @Rune: MoveNextDelegate is just a delegate field which refers to MoveNext. It's cached to avoid creating a new Action to pass into the awaiter each time, I believe. – Jon Skeet Feb 17 '11 at 14:55
  • If t1 is already complete and it enters the default case again, will BeginAwait return immediately with false, thereby making the MoveNext call innocuous? – Jeff Yates Feb 17 '11 at 15:00
  • 5
    I think the answer is: this is a CTP. The high order bit for the team was getting this out there and the language design validated. And they did so amazingly quickly. You should expect the shipped implementation (of the compilers, not MoveNext) to differ significantly. I think Eric or Lucian to come back with an answer along the lines that there's nothing deep here, just a behavior/bug that doesn't matter in most cases and no one noticed. Because it's a CTP. – Chris Burrows Feb 17 '11 at 16:45
  • 1
    @Chris: That sounds entirely likely. Btw, you may want to update your Stack Overflow user profile :) – Jon Skeet Feb 17 '11 at 16:52
  • Have you checked the actual IL code? I wonder if this can be a bug in Reflector. – Stilgar Feb 18 '11 at 11:33
  • 2
    @Stilgar: I've just checked with ildasm, and it really is doing this. – Jon Skeet Feb 18 '11 at 11:36
  • What exactly is in the awaiter? Is it possible that EndAwait somehow observes the state of the field? – Stilgar Feb 18 '11 at 11:48
  • 1
    @Chris: Aside: your rep is currently (8:55AM EST, 2/18/2011) 666. – Jeff Yates Feb 18 '11 at 13:54
  • 1
    @Stilgar: The generated code shouldn't know or care about the awaiter. The spec says what the awaiter should do in terms of the callers to BeginAwait and EndAwait - it can't snoop on the state. – Jon Skeet Feb 18 '11 at 13:56
  • 3
    @JonSkeet: Notice how no one upvotes the answers. 99% of us can't really tell if the answer even sounds about right. – the_drow Mar 04 '11 at 04:46
  • 1
    @the_drow: True. I'm not sufficiently convinced by any of them at the moment... Chris's explanation is the only one I believe at the minute. It'll be interesting to see what happens when the next CTP/beta arrives. – Jon Skeet Mar 04 '11 at 06:23
  • 1
    I am summoning Eric Lippert as strongly as I can. –  Mar 04 '11 at 06:44
  • @Inuyasha: I pinged Eric when I wrote the question. He's a very busy man, and realistically this isn't an *important* question in terms of how async is used. – Jon Skeet Mar 04 '11 at 07:05
  • If there is a good reason for this to happen it is probably because of how this kind of code ends up in machine code, however, in this particular case I can think of nothing that would explain this. The answer to this question, therefore, must be: It's an inefficiency in the compiler that should have been filtered out, because it serves no purpose, but has a very low impact on eventual performance and/or code size. – Wouter Simons Mar 10 '11 at 12:41
  • @Wouter: I'm not worried about the performance so much as the *correctness*. If there's any "intermediate" state, it should be a separate number meaning, "it's not appropriate to be called back at this point" as distinct from "I'm at the start of the method". We'll see what happens when we get the CTP refresh though :) – Jon Skeet Mar 10 '11 at 15:41
  • @Damien: Lucian Wischik told me that the VB and C# compilers do in fact share most of the code for this. @Jon: You were there - don't you remember? He also asked if the C# iterator block code should be changed to use that new code... – configurator Mar 12 '11 at 18:44
  • @configurator: I was there when we were discussing it, but I'm not sure I heard that... at least not for the async CTP. I thought that was the plan for the *new* (managed) compiler. – Jon Skeet Mar 12 '11 at 19:06

4 Answers4

71

Okay, I finally have a real answer. I sort of worked it out on my own, but only after Lucian Wischik from the VB part of the team confirmed that there really is a good reason for it. Many thanks to him - and please visit his blog (on archive.org), which rocks.

The value 0 here is only special because it's not a valid state which you might be in just before the await in a normal case. In particular, it's not a state which the state machine may end up testing for elsewhere. I believe that using any non-positive value would work just as well: -1 isn't used for this as it's logically incorrect, as -1 normally means "finished". I could argue that we're giving an extra meaning to state 0 at the moment, but ultimately it doesn't really matter. The point of this question was finding out why the state is being set at all.

The value is relevant if the await ends in an exception which is caught. We can end up coming back to the same await statement again, but we mustn't be in the state meaning "I'm just about to come back from that await" as otherwise all kinds of code would be skipped. It's simplest to show this with an example. Note that I'm now using the second CTP, so the generated code is slightly different to that in the question.

Here's the async method:

static async Task<int> FooAsync()
{
    var t = new SimpleAwaitable();
    
    for (int i = 0; i < 3; i++)
    {
        try
        {
            Console.WriteLine("In Try");
            return await t;
        }                
        catch (Exception)
        {
            Console.WriteLine("Trying again...");
        }
    }
    return 0;
}

Conceptually, the SimpleAwaitable can be any awaitable - maybe a task, maybe something else. For the purposes of my tests, it always returns false for IsCompleted, and throws an exception in GetResult.

Here's the generated code for MoveNext:

public void MoveNext()
{
    int returnValue;
    try
    {
        int num3 = state;
        if (num3 == 1)
        {
            goto Label_ContinuationPoint;
        }
        if (state == -1)
        {
            return;
        }
        t = new SimpleAwaitable();
        i = 0;
      Label_ContinuationPoint:
        while (i < 3)
        {
            // Label_ContinuationPoint: should be here
            try
            {
                num3 = state;
                if (num3 != 1)
                {
                    Console.WriteLine("In Try");
                    awaiter = t.GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        state = 1;
                        awaiter.OnCompleted(MoveNextDelegate);
                        return;
                    }
                }
                else
                {
                    state = 0;
                }
                int result = awaiter.GetResult();
                awaiter = null;
                returnValue = result;
                goto Label_ReturnStatement;
            }
            catch (Exception)
            {
                Console.WriteLine("Trying again...");
            }
            i++;
        }
        returnValue = 0;
    }
    catch (Exception exception)
    {
        state = -1;
        Builder.SetException(exception);
        return;
    }
  Label_ReturnStatement:
    state = -1;
    Builder.SetResult(returnValue);
}

I had to move Label_ContinuationPoint to make it valid code - otherwise it's not in the scope of the goto statement - but that doesn't affect the answer.

Think about what happens when GetResult throws its exception. We'll go through the catch block, increment i, and then loop round again (assuming i is still less than 3). We're still in whatever state we were before the GetResult call... but when we get inside the try block we must print "In Try" and call GetAwaiter again... and we'll only do that if state isn't 1. Without the state = 0 assignment, it will use the existing awaiter and skip the Console.WriteLine call.

It's a fairly tortuous bit of code to work through, but that just goes to show the kinds of thing that the team has to think about. I'm glad I'm not responsible for implementing this :)

Igor
  • 349
  • 1
  • 6
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 9
    @Shekhar_Pro: Yes, it's a goto. You should *expect* to see plenty of goto statements in auto-generated state machines :) – Jon Skeet Apr 23 '11 at 07:54
  • 12
    @Shekhar_Pro: Within manually written code, it is - because it makes the code hard to read and follow. No-one reads autogenerated code though, except fools like me who decompile it :) – Jon Skeet Apr 23 '11 at 08:11
  • So what _does_ happen when we await again after an exception? We start all over again? – configurator Apr 23 '11 at 22:20
  • 1
    @configurator: It calls GetAwaiter on the awaitable, which is what I'd expect it to do. – Jon Skeet May 06 '11 at 14:11
  • gotos don't always make code harder to read. In fact, sometimes they even make sense to use (sacrilege to say, I know). For example, sometimes you may need to break multiple nested loops. I lesser used feature of goto (and IMO uglier use) is to cause switch statements to cascade. On a seperate note, I remember a day and age when gotos were a primary underpinning of some programming languages and because of that I fully realize why the mere mention of goto makes developers shudder. They can make things ugly if used poorly. – Ben Lesh Mar 13 '12 at 03:38
5

if it was kept at 1 (first case) you would get a call to EndAwait without a call to BeginAwait. If it's kept at 2 (second case) you'd get the same result just on the other awaiter.

I'm guessing that calling the BeginAwait returns false if it has be started already (a guess from my side) and keeps the original value to return at the EndAwait. If that's the case it would work correctly whereas if you set it to -1 you might have an uninitialized this.<1>t__$await1 for the first case.

This however assumes that BeginAwaiter won't actually start the action on any calls after the first and that it will return false in those cases. Starting would of course be unacceptable since it could have side effect or simply give a different result. It also assumpes that the EndAwaiter will always return the same value no matter how many times it's called and that is can be called when BeginAwait returns false (as per the above assumption)

It would seem to be a guard against race conditions If we inline the statements where movenext is called by a different thread after the state = 0 in questions it woule look something like the below

this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.<>1__state = 0;

//second thread
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.$__doFinallyBodies = true;
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

//other thread
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

If the assumptions above are correct the there's some unneeded work done such as get sawiater and reassigning the same value to <1>t__$await1. If the state was kept at 1 then the last part would in stead be:

//second thread
//I suppose this un matched call to EndAwait will fail
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

further if it was set to 2 the state machine would assume it already had gotten the value of the first action which would be untrue and a (potentially) unassigned variable would be used to calculate the result

Rob
  • 45,296
  • 24
  • 122
  • 150
Rune FS
  • 21,497
  • 7
  • 62
  • 96
  • Bear in mind that the state isn't actually being *used* between the assignment to 0 and the assignment to a more meaningful value. If it's meant to guard against race conditions, I'd expect some other value to indicate that, e.g. -2, with a check for that at the start of MoveNext to detect inappropriate use. Bear in mind that a single instance should never actually be used by two threads at a time anyway - it's meant to give the illusion of a single synchronous method call which manages to "pause" every so often. – Jon Skeet Feb 17 '11 at 15:44
  • @Jon I agree it shouldn't be an issue with a race condition in the async case but could be in iteration block and could be a left over – Rune FS Feb 17 '11 at 15:57
  • @Tony: I think I'll wait until the next CTP or beta comes out, and check that behaviour. – Jon Skeet Mar 03 '11 at 06:29
1

Could it be something to do with stacked/nested async calls ?..

i.e:

async Task m1()
{
    await m2;
}

async Task m2()
{
    await m3();
}

async Task m3()
{
Thread.Sleep(10000);
}

Does the movenext delegate get called multiple times in this situation ?

Just a punt really?

GaryMcAllister
  • 141
  • 1
  • 3
  • There would be three different generated classes in that case. `MoveNext()` would be called once on each of them. – Jon Skeet Feb 17 '11 at 14:45
0

Explanation of actual states:

possible states:

  • 0 Initialized (i think so) or waiting for end of operation
  • >0 just called MoveNext, chosing next state
  • -1 ended

Is it possible that this implementation just wants to assure that if another Call to MoveNext from whereever happens (while waiting) it will reevaluate the whole state-chain again from the beginning, to reevaluate results which could be in the mean time already outdated?

fixagon
  • 5,506
  • 22
  • 26
  • But why would it want to start from the beginning? That's almost certainly *not* what you'd actually want to happen - you'd want an exception thrown, because nothing else *should* be calling MoveNext. – Jon Skeet Mar 03 '11 at 06:28