20

Update: Well, now I've gone and done it: I filed a bug report with Microsoft about this, as I seriously doubt that it is correct behavior. That said, I'm still not 100% sure what to believe regarding this question; so I can see that what is "correct" is open to some level of interpretation.

My feeling is that either Microsoft will accept that this is a bug, or else respond that the modification of a mutable value type variable within a using statement constitutes undefined behavior.

Also, for what it's worth, I have at least a guess as to what is happening here. I suspect that the compiler is generating a class for the closure, "lifting" the local variable to an instance field of that class; and since it is within a using block, it's making the field readonly. As LukeH pointed out in a comment to the other question, this would prevent method calls such as MoveNext from modifying the field itself (they would instead affect a copy).


Note: I have shortened this question for readability, though it is still not exactly short. For the original (longer) question in its entirety, see the edit history.

I have read through what I believe are the relevant sections of the ECMA-334 and cannot seem to find a conclusive answer to this question. I will state the question first, then provide a link to some additional comments for those who are interested.

Question

If I have a mutable value type that implements IDisposable, I can (1) call a method that modifies the state of the local variable's value within a using statement and the code behaves as I expect. Once I capture the variable in question inside a closure within the using statement, however, (2) modifications to the value are no longer visible in the local scope.

This behavior is only apparent in the case where the variable is captured inside the closure and within a using statement; it is not apparent when only one (using) or the other condition (closure) is present.

Why does capturing a variable of a mutable value type inside a closure within a using statement change its local behavior?

Below are code examples illustrating items 1 and 2. Both examples will utilize the following demonstration Mutable value type:

struct Mutable : IDisposable
{
    int _value;
    public int Increment()
    {
        return _value++;
    }

    public void Dispose() { }
}

1. Mutating a value type variable within a using block

using (var x = new Mutable())
{
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The output code outputs:

0
1

2. Capturing a value type variable inside a closure within a using block

using (var x = new Mutable())
{
    // x is captured inside a closure.
    Func<int> closure = () => x.Increment();

    // Now the Increment method does not appear to affect the value
    // of local variable x.
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The above code outputs:

0
0

Further Comments

It has been noted that the Mono compiler provides the behavior I expect (changes to the value of the local variable are still visible in the using + closure case). Whether this behavior is correct or not is unclear to me.

For some more of my thoughts on this issue, see here.

Community
  • 1
  • 1
Dan Tao
  • 125,917
  • 54
  • 300
  • 447
  • Interestingly, this behavior does not exhibit itself on Mono, where iteration over all three elements happens just fine. – cdhowie Jan 10 '11 at 00:05
  • @cdhowie: That *is* interesting. Give me a minute to verify on my machine... – Dan Tao Jan 10 '11 at 00:08
  • The fact that mono gives the expected answer makes it seem like a bug in csc, but the expected answer might not be the correct one. For example, closing over the iteration variable in a `foreach` loop usually gives unexpected, if correct, results. – Gabe Jan 10 '11 at 07:50
  • @Gabe: That's true, but closing over the loop variable doesn't change the semantics of the `foreach` itself. – LukeH Jan 10 '11 at 12:10
  • LukeH: You're right; that was just an example of where the unexpected behavior is actually correct according to the specs. – Gabe Jan 10 '11 at 13:41
  • @Gabe: I definitely agree, and I'm *trying* to choose my words carefully so that I'm not saying one behavior is right or wrong (though clearly I personally *prefer* one). But what's so baffling to me is that the behavior csc generates could be correct when it is specific to the `using` plus closure case. It seems to me it should either behave one way, in the presence of a closure, consistently; or it should behave the other way consistently. As it stands, it is the inconsistency that bugs me. However, I'm willing to believe there may be an explanation that just isn't obvious to me... – Dan Tao Jan 10 '11 at 13:53
  • 1
    I'm surprised Eric Lippert hasn't chimed in on this. This question seems like the kind of thing few people aside from him can answer. – Gabe Jan 11 '11 at 06:43
  • @Gabe: I felt that perhaps the length and density of the question was turning off potential readers. In a somewhat feeble attempt to rectify that, I've rewritten it. Time will tell if I get any more bites, from Lippert or elsewhere! – Dan Tao Jan 11 '11 at 15:24
  • @Dan: Regarding your latest update... That sort-of-does/sort-of-doesn't describe what's actually happening: The field that represents the captured local in the compiler-generated class *is not* marked `readonly` but, interestingly, the IL that accesses the field within the `using` block is *exactly the same* as would be generated if the field was `readonly` (ie, copy the field into a local and mutate the local rather than mutating the field itself). – LukeH Jan 13 '11 at 01:47
  • 2
    @Gabe: I do eat and sleep you know. I don't see every question on this site! – Eric Lippert Jan 14 '11 at 06:59
  • 2
    @Eric: Sorry, I simply assumed you were another instance of the SkeetBot, just run with different parameters. – Gabe Jan 14 '11 at 08:13
  • 1
    I once read an advise from Jon Skeet: stay away from mutable structs, for they're only trouble. This question seems to further reinforce this opinion. – ShdNx Apr 29 '11 at 19:07

4 Answers4

11

This has to do with the way closure types are generated and used. There appears to be a subtle bug in the way csc uses these types. For example, here is the IL generated by Mono's gmcs when invoking MoveNext():

      IL_0051:  ldloc.3
      IL_0052:  ldflda valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Foo/'<Main>c__AnonStorey0'::enumerator
      IL_0057:  call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()

Note that it's loading the field's address, which allows the method call to modify the instance of the value type stored on the closure object. This is what I would consider to be correct behavior, and this results in the list contents being enumerated just fine.

Here's what csc generates:

      IL_0068:  ldloc.3
      IL_0069:  ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Tinker.Form1/'<>c__DisplayClass3'::enumerator
      IL_006e:  stloc.s 5
      IL_0070:  ldloca.s 5
      IL_0072:  call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()

So in this case it's taking a copy of the value type instance and invoking the method on the copy. It should be no surprise why this gets you nowhere. The get_Current() call is similarly wrong:

      IL_0052:  ldloc.3
      IL_0053:  ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Tinker.Form1/'<>c__DisplayClass3'::enumerator
      IL_0058:  stloc.s 5
      IL_005a:  ldloca.s 5
      IL_005c:  call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
      IL_0061:  call void class [mscorlib]System.Console::WriteLine(int32)

Since the state of the enumerator it's copying has not had MoveNext() called, get_Current() apparently returns default(int).

In short: csc appears to be buggy. It's interesting that Mono got this right while MS.NET did not!

...I'd love to hear Jon Skeet's comments on this particular oddity.


In a discussion with brajkovic in #mono, he determined that the C# language specification does not actually detail how the closure type should be implemented, nor how accesses of locals that are captured in the closure should get translated. An example implementation in the spec seems to use the "copy" method that csc uses. Therefore either compiler output can be considered correct according to the language specification, though I would argue that csc should at least copy the local back to the closure object after the method call.

cdhowie
  • 158,093
  • 24
  • 286
  • 300
  • The docs for `IEnumerator` specifically state that the behaviour of `Current` is undefined prior to the first `MoveNext` call. But you're right that the rest of it looks like a codegen bug in the MS compiler. – LukeH Jan 10 '11 at 00:24
  • Yeah, I find it a bit strange that the `Current` property doesn't throw an exception before enumeration for `List` as well (it actually *does* for other collections such as `Queue`), though @LukeH is right that this behavior is specifically documented as "undefined"; so MS did at least cover their backs there. The fact that the Mono compiler exhibits the behavior I would expect makes me uneasy. It certainly does *look* like a bug to me... – Dan Tao Jan 10 '11 at 00:28
  • @LukeH: Ah, good to know. I've always seen enumerators throw exceptions in that case, so I expected that to be part of the interface contract. Another reason to check the docs. :) – cdhowie Jan 10 '11 at 00:28
  • Personally this behavior seems correct based off of what I know of value types. What do I know though. – ChaosPandion Jan 10 '11 at 00:55
  • @ChaosPandion: You could technically call it correct either way. Using `ldflda`, Mono side-steps the copying issue in general by using the struct via its address on the closure object. So the fact that the code example works in Mono is evidence that it *can* be made to work, and this behavior is IMO correct since closures should treat uses of captured locals like they all refer to one variable. Using the value type by its closure-type-field address does exactly this, while copying the value locally is extremely unintuitive and causes very bad side effects. – cdhowie Jan 10 '11 at 01:02
  • 1
    @ChaosPandion: To clarify my previous comment, note that gmcs translates `ldloca` for captured variables to `ldflda` -- this seems like the correct way to go. csc instead translates `ldloca` to `ldfld; stloc; ldloca`. gmcs preserves the semantics of locals when they are captured; csc does not. – cdhowie Jan 10 '11 at 01:58
  • 1
    For anyone wishing to duplicate the research, the relevant section of ECMA-334 is §14.5.15.5, and the spec is freely available at http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf. The preceding sections (more generally, all of §14.5.15) are about anonymous methods, but the relevant section is .5. – Bojan Rajkovic Jan 10 '11 at 06:21
  • It's the strangest thing; I just discovered that this behavior **only** occurs *within a `using` block* when the variable is included in a closure. Within a `using` with no closure it works fine. With a closure *not* within a `using` it also works fine. It's only in the case of *both* that this happens. What do you think of this? I'm having trouble imagining how this could not be considered a bug. – Dan Tao Jan 10 '11 at 07:38
  • 1
    @Dan: Hmm, that's *really* interesting. It might be worth reporting this as a bug. – cdhowie Jan 10 '11 at 16:47
  • @cdhowie: [Done](https://connect.microsoft.com/VisualStudio/feedback/details/635745). – Dan Tao Jan 12 '11 at 18:29
  • 1
    It seems a little unfair... kind of Eric Lippert getting the checkmark just because only he could know for sure what he says (that it's a bug)... but since his answer has to be viewed as authoritative, I accepted it. This was a great answer too, though. – Dan Tao Jan 14 '11 at 13:38
  • The fundamental problem is that C# and .net lack 'const ref' parameters, and do not allow a struct's member functions to declare whether the argument should be accepted as 'ref' or 'const ref'. If one tries to pass a readonly struct instance as a 'ref', the compiler shouldn't make a copy--it should simply forbid the behavior. Fundamentally, if code generation would require making an implicit copy of a struct, the design of the code generator is broken. Unfortunately, given the lack of 'const ref', there isn't much choice but to horribly kludge call-by-tempvalue-ref. – supercat Jan 20 '12 at 23:54
7

It's a known bug; we discovered it a couple years ago. The fix would be potentially breaking, and the problem is pretty obscure; these are points against fixing it. Therefore it has never been prioritized high enough to actually fix it.

This has been in my queue of potential blog topics for a couple years now; perhaps I ought to write it up.

And incidentally, your conjecture as to the mechanism that explains the bug is completely accurate; nice psychic debugging there.

So, yes, known bug, but thanks for the report regardless!

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • Blast! Thought I had discovered something new! – Dan Tao Jan 14 '11 at 07:08
  • @Dan: Nope, but great job analyzing it regardless. Incidentally, I seem to recall that there are similar bugs involving "lock". – Eric Lippert Jan 14 '11 at 07:23
  • @EricLippert: It seems there is the same issue with foreach loop variable as well - http://stackoverflow.com/questions/13610559/c-sharp-struct-instance-behavior-changes-when-captured-in-lambda/13624476 – Sergey Teplyakov Dec 06 '12 at 09:08
  • 1
    @SergeyTeplyakov: There is a very similar but not exactly identical issue with the foreach loop variable, yes. – Eric Lippert Dec 11 '12 at 16:40
0

EDIT - This is incorrect, I didn't read the question carefully enough.

Placing the struct into a closure causes an assignment. Assignments on value types result in a copy of the type. So what's happening is you are creating a new Enumerator<int>, and Current on that enumerator will return 0.

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<int> l = new List<int>();
        Console.WriteLine(l.GetEnumerator().Current);
    }
}

Result: 0

Martin Doms
  • 8,598
  • 11
  • 43
  • 60
0

The problem is the enumerator is stored in another class so every action is working with a copy of the enumerator.

[CompilerGenerated]
private sealed class <>c__DisplayClass3
{
    // Fields
    public List<int>.Enumerator enumerator;

    // Methods
    public int <Main>b__1()
    {
        return this.enumerator.Current;
    }
}

public static void Main(string[] args)
{
    List<int> <>g__initLocal0 = new List<int>();
    <>g__initLocal0.Add(1);
    <>g__initLocal0.Add(2);
    <>g__initLocal0.Add(3);
    List<int> list = <>g__initLocal0;
    Func<int> CS$<>9__CachedAnonymousMethodDelegate2 = null;
    <>c__DisplayClass3 CS$<>8__locals4 = new <>c__DisplayClass3();
    CS$<>8__locals4.enumerator = list.GetEnumerator();
    try
    {
        if (CS$<>9__CachedAnonymousMethodDelegate2 == null)
        {
            CS$<>9__CachedAnonymousMethodDelegate2 = new Func<int>(CS$<>8__locals4.<Main>b__1);
        }
        while (CS$<>8__locals4.enumerator.MoveNext())
        {
            Console.WriteLine(CS$<>8__locals4.enumerator.Current);
        }
    }
    finally
    {
        CS$<>8__locals4.enumerator.Dispose();
    }
}

Without the lambda the code is closer to what you would expect.

public static void Main(string[] args)
{
    List<int> <>g__initLocal0 = new List<int>();
    <>g__initLocal0.Add(1);
    <>g__initLocal0.Add(2);
    <>g__initLocal0.Add(3);
    List<int> list = <>g__initLocal0;
    using (List<int>.Enumerator enumerator = list.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            Console.WriteLine(enumerator.Current);
        }
    }
}

Specific IL

L_0058: ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Machete.Runtime.Environment/<>c__DisplayClass3::enumerator
L_005d: stloc.s CS$0$0001
L_005f: ldloca.s CS$0$0001
ChaosPandion
  • 77,506
  • 18
  • 119
  • 157
  • 1
    The (presumably Reflector-generated) output there is flatly wrong. `while (CS$<>8__locals4.enumerator.MoveNext())` should work just fine. What actually happening is effectively `List.Enumerator foo; while ((foo = CS$<>8__locals4.enumerator), foo.MoveNext())` (though this is not syntactically valid, you get the point). In other words, instead of operating on the closure object's field by address -- which would work -- a local copy is being made. – cdhowie Jan 10 '11 at 00:19
  • @cdhowie - You are correct but it takes a copy of the field for each iteration. – ChaosPandion Jan 10 '11 at 00:34
  • @Chaos: That's exactly what I said. But the code generated by Reflector **does not** take a copy, and therefore is not an accurate picture of what's really going on. – cdhowie Jan 10 '11 at 00:35
  • 1
    @cdhowie is correct. If you take that Reflector-ed code and paste it back into your own source then it'll work correctly. (Obviously after renaming the compiler-generated identifiers to something legal first.) – LukeH Jan 10 '11 at 00:36
  • @cdhowie - The copy is the whole act of grabbing the field from the instance. What exactly are you trying to tell me? – ChaosPandion Jan 10 '11 at 00:41
  • 1
    @ChaosPandion: That the line of code `while (CS$<>8__locals4.enumerator.MoveNext())` as generated by Reflector is simply incorrect. If you compiled that code, it would emit a `ldflda; call;` sequence, not a `ldfld; stloc; ldloca; call;` sequence. – cdhowie Jan 10 '11 at 00:47