Apologies if this question has been asked already, but suppose we have this code (I've run it with Mono 2.10.2 and compiled with gmcs 2.10.2.0):

using System;

public class App {
    public static void Main(string[] args) {
        Func<string> f = null;
        var strs = new string[]{
            "foo",
            "bar",
            "zar"
        };

        foreach (var str in strs) {
            if ("foo".Equals(str)) 
                f = () => str;
        }
        Console.WriteLine(f());     // [1]: Prints 'zar'

        foreach (var str in strs) {
            var localStr = str;
            if ("foo".Equals(str))
                f = () => localStr;
        }
        Console.WriteLine(f());     // [2]: Prints 'foo'

        {
            int i = 0;
            for (string str; i < strs.Length; ++i) {
                str = strs[i];
                if ("foo".Equals(str))
                    f = () => str;
            }
        }
        Console.WriteLine(f());     // [3]: Prints 'zar'
    }
}

It seems logical that [1] should print the same as [3]. But to be honest, I expected it to print the same as [2], because I believed the implementation of [1] would be closer to [2].

Question: Could anyone please provide a reference to the specification that says exactly how the str variable (or perhaps even the iterator) is captured by the lambda in [1]?

I guess what I am looking for is the exact expansion of the foreach loop.

sinharaj

4 Answers

You asked for a reference to the specification; the relevant location is section 8.8.4, which states that a "foreach" loop is equivalent to:

    V v;
    while (e.MoveNext()) {
        v = (V)(T)e.Current;
        embedded-statement
    }

Note that the value v is declared outside the while loop, and therefore there is a single loop variable. That is then closed over by the lambda.
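To see that expansion in action regardless of compiler version, here is a minimal sketch that writes the loop out by hand with the variable declared outside the while loop. The class and method names are made up for illustration, and the spec's `(V)(T)` casts are dropped because the element type is already string.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansionCSharp4
{
    // Hand-written version of the expansion quoted above: one variable v,
    // declared outside the loop and shared by every iteration.
    public static string Run()
    {
        Func<string> f = null;
        IEnumerable<string> strs = new[] { "foo", "bar", "zar" };

        using (IEnumerator<string> e = strs.GetEnumerator())
        {
            string v; // single variable for the whole loop
            while (e.MoveNext())
            {
                v = e.Current;
                if (v == "foo")
                    f = () => v; // captures the variable v, not its current value
            }
        }

        // By now v has been overwritten with the last element.
        return f();
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "zar"
    }
}
```

Because the lambda reads v only when it is invoked, it sees whatever was assigned last, which matches [1] in the question.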

UPDATE

Because so many people run into this problem the C# design and compiler team changed C# 5 to have these semantics:

    while (e.MoveNext()) {
        V v = (V)(T)e.Current;
        embedded-statement
    }

Which then has the expected behaviour -- you close over a different variable every time. Technically that is a breaking change, but the number of people who depend on the weird behaviour you are experiencing is hopefully very small.

Be aware that C# 2, 3, and 4 are now incompatible with C# 5 in this regard. Also note that the change only applies to foreach, not to for loops.
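As a rough illustration of the difference, the sketch below writes the C# 5 style expansion by hand, next to a for loop like [3] from the question. The names are hypothetical and the casts from the spec are omitted; since the foreach part is expanded manually, the result should not depend on which compiler version you use.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansionCSharp5
{
    // C# 5 style expansion: v is declared inside the loop body,
    // so each iteration gets a fresh variable.
    public static string ForeachStyle()
    {
        Func<string> f = null;
        IEnumerable<string> strs = new[] { "foo", "bar", "zar" };

        using (IEnumerator<string> e = strs.GetEnumerator())
        {
            while (e.MoveNext())
            {
                string v = e.Current; // new variable every iteration
                if (v == "foo")
                    f = () => v; // this v is never reassigned
            }
        }
        return f();
    }

    // A for loop is unaffected by the change: str is declared once
    // in the loop header and reused by every iteration.
    public static string ForStyle()
    {
        Func<string> f = null;
        string[] strs = { "foo", "bar", "zar" };

        int i = 0;
        for (string str; i < strs.Length; ++i)
        {
            str = strs[i];
            if (str == "foo")
                f = () => str; // shared variable, later overwritten
        }
        return f();
    }

    public static void Main()
    {
        Console.WriteLine(ForeachStyle()); // prints "foo"
        Console.WriteLine(ForStyle());     // prints "zar"
    }
}
```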

See http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/ for details.


Commenter abergmeier states:

C# is the only language that has this strange behavior.

This statement is categorically false. Consider the following JavaScript:

var funcs = [];
var results = [];
for(prop in { a : 10, b : 20 })
{
  funcs.push(function() { return prop; });
  results.push(funcs[0]());
}

abergmeier, would you care to take a guess as to what are the contents of results?

Eric Lippert
  • The first implementation is consistent with the usual scoping of variables declared in the head of a `for` loop. But the new implementation tries to accommodate for the fact that humans expect and depend upon the inconsistent behaviour (which can lead to hard to discover bugs--it happened to me). Well, to be honest, I have mixed feelings about this decision... – sinharaj Aug 19 '11 at 16:11
  • @sinharaj: I also have mixed feelings. Consistency is nice to have because it helps your intuition. But this is a case where consistency is working *against* people's intuition of what the expected behaviour is. People don't think of the loop variable of the foreach as being a *variable*, they think of it as being a *value*. (And you can't treat it as a variable; you can't write to it yourself, for instance.) I don't want to have a "foolish consistency" that causes more harm than it prevents. – Eric Lippert Aug 19 '11 at 16:57
  • @Eric: I vote for the change. IMHO it is counter intuitive, the way it is now. The lambda that captures the variable is declared inside the loop, so people expect it to have the value at that point in time. However, this is just one installment of a whole bunch of related problems. `var i=0; Action a = () => Debug.Write(i); i = 2; a();` Here the same counter intuitive change of i happens. I don't know if it is good, to change this behavior in one place, but leave it be at another. – Daniel Hilgarth Aug 19 '11 at 17:19
  • @Daniel: But think about it this way: `var city = "London"; var q = from c in customers where c.City == city select c; Console.WriteLine(q.First()); city = "Manchester"; Console.WriteLine(q.First());` -- one expects that this will give two different results. The lambda is closed over the variable, not the value, and it should observe the most recent state of the variable. – Eric Lippert Aug 19 '11 at 17:48
  • @Eric: Does one really expect this? I am not so sure about this. Not if you don't know that the query uses deferred execution. And even if: Once it has been executed, why should it suddenly change its result? I sure found things like this strange when I started using LINQ and lambdas. – Daniel Hilgarth Aug 19 '11 at 17:54
  • @Daniel: Well, suppose you had `class C { public string city; string M() { return this.city; } } ... C c = new C(); c.city = "London"; Console.WriteLine(c.M()); c.city = "Manchester"; Console.WriteLine(c.M());` -- there you'd expect the "result" of the "query" that is M() to vary as the variable mutates. A lambda is nothing more than a more convenient syntax for writing a method like M on a class like C. – Eric Lippert Aug 19 '11 at 18:16
  • @Eric: I know what a lambda is and I know that the current way lambdas work is consistent. But it is not obvious that the lambda is what it is. All I am saying is: Mutating the query by changing a variable that was captured by the closure is not intuitive. – Daniel Hilgarth Aug 19 '11 at 19:24
  • @Eric Then the question becomes "Is the fact that lambdas capture variables by hoisting them into a class expected?". If you understand the feature, then sure. Intuitively, though, I'm not sure that's the case. For example, when you write `int i = 10; int j = i; i = 15;`, I think most people understand that `j` is still 10 (though screwing up value-type semantics for non-basic types is a whole other issue.) But when you say `int i = 10; Func<int> a = () => i; i = 15;`, the semantics have changed, even though it could appear to someone who doesn't know what's happening that they would be the same. – dlev Aug 19 '11 at 19:26
  • @Eric (cont'd) I think intuitively, people see the variable in the lambda, and assume that the expression is evaluated *right now*. Obviously, that's not the case, but I can see how just seeing a variable sitting there on a line does suggest that interpretation. – dlev Aug 19 '11 at 19:29
  • Unrelated: Please make the change you describe. – dlev Aug 19 '11 at 19:30
  • @Eric: please make this change. I think most people think of `foreach` as giving each item in succession, and not "assigning each item to the loop variable in succession". – harold Aug 20 '11 at 06:59
  • @harold Besides C# is the only language that has this strange behavior. Really annoying. – abergmeier Aug 02 '13 at 14:17
  • @abergmeier: Your first statement is categorically false. – Eric Lippert Aug 02 '13 at 15:19

The core difference between [1] / [3] and [2] is the lifetime of the variable being captured. In [1] and [3] the lambda captures the iteration variable str. In for loops, and in foreach loops before C# 5, there is one iteration variable for the lifetime of the loop. When the lambda is executed after the loop ends, it sees the final value: "zar".

In [2] you are capturing a local variable whose lifetime is a single iteration of the loop. That variable is never reassigned, so the lambda still sees the value it had in that iteration, which is "foo".
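One way to picture the lifetime difference is to write the closure by hand. The sketch below uses a hypothetical Box class as a stand-in for the class the compiler generates (the real generated names and layout differ): the captured local becomes a field and the lambda becomes a method on that class. No lambdas are used, so the output does not depend on the compiler version.

```csharp
using System;

public static class ClosureLifetimes
{
    // Hypothetical stand-in for a compiler-generated closure class.
    private sealed class Box
    {
        public string Value;
        public string Get() { return Value; }
    }

    // Like [1] and [3]: one captured variable for the whole loop.
    public static string SharedVariable(string[] strs)
    {
        Func<string> f = null;
        var box = new Box(); // created once, outside the loop
        foreach (var s in strs)
        {
            box.Value = s; // every iteration overwrites the same field
            if (s == "foo")
                f = box.Get;
        }
        return f();
    }

    // Like [2]: a new captured variable in each iteration.
    public static string PerIterationVariable(string[] strs)
    {
        Func<string> f = null;
        foreach (var s in strs)
        {
            var box = new Box(); // fresh instance per iteration
            box.Value = s;
            if (s == "foo")
                f = box.Get;
        }
        return f();
    }

    public static void Main()
    {
        var strs = new[] { "foo", "bar", "zar" };
        Console.WriteLine(SharedVariable(strs));       // prints "zar"
        Console.WriteLine(PerIterationVariable(strs)); // prints "foo"
    }
}
```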

The best reference I can point you to is Eric's blog post on the subject.

Eric Lippert
JaredPar

For people arriving from Google:

I fixed this lambda bug with the following approach.

I changed this:

for(int i=0;i<9;i++)
    btn.OnTap += () => { ChangeCurField(i * 2); };

to this:

for(int i=0;i<9;i++)
{
    int numb = i * 2;
    btn.OnTap += () => { ChangeCurField(numb); };
}

This gives each lambda its own "numb" variable. It also means that i * 2 is evaluated right there in the loop body, not later when the lambda is called.
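Here is a small self-contained sketch of the same fix. It swaps the button and ChangeCurField for a hypothetical list of Func<int> so the results can be inspected. Since for loops were not changed in C# 5, this should behave the same on old and new compilers.

```csharp
using System;
using System.Collections.Generic;

public static class ForLoopCapture
{
    // Every lambda captures the same loop variable i.
    public static List<int> CaptureLoopVariable()
    {
        var handlers = new List<Func<int>>();
        for (int i = 0; i < 9; i++)
            handlers.Add(() => i * 2); // i is read when the lambda runs

        var results = new List<int>();
        foreach (var h in handlers)
            results.Add(h()); // i is already 9 here, so each call returns 18
        return results;
    }

    // Every lambda captures its own copy, declared inside the loop body.
    public static List<int> CaptureLocalCopy()
    {
        var handlers = new List<Func<int>>();
        for (int i = 0; i < 9; i++)
        {
            int numb = i * 2; // evaluated now, one variable per iteration
            handlers.Add(() => numb);
        }

        var results = new List<int>();
        foreach (var h in handlers)
            results.Add(h());
        return results;
    }

    public static void Main()
    {
        // prints 18,18,18,18,18,18,18,18,18
        Console.WriteLine(string.Join(",", CaptureLoopVariable()));
        // prints 0,2,4,6,8,10,12,14,16
        Console.WriteLine(string.Join(",", CaptureLocalCopy()));
    }
}
```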

Ghandhikus

The following happens in loop 1 and 3:

The current value is assigned to the variable str. It is always the same variable, just with a different value in each iteration. This variable is captured by the lambda. As the lambda is executed after the loop finishes, it has the value of the last element in your array.

The following happens in loop 2:

The current value is assigned to a new variable localStr. It is always a new variable that gets the value assigned. This new variable is captured by the lambda. Because the next iteration of the loop creates a new variable, the value of the captured variable is not changed and because of that it outputs "foo".

Daniel Hilgarth