20

I just encountered the following behavior:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(() => {
        Debug.Print("Error: " + i.ToString());
    });
}

Will result in a series of "Error: x", where most of the x are equal to 50.

Similarly:

var a = "Before";
var task = new Task(() => Debug.Print("Using value: " + a));
a = "After";
task.Start();

Will result in "Using value: After".

This clearly means that the concatenation in the lambda expression does not occur immediately. How can I make the lambda expression use a copy of the outer variable, taken at the time the expression is declared? The following does not work any better (which is not necessarily incoherent, I admit):

var a = "Before";
var task = new Task(() => {
    var a2 = a;
    Debug.Print("Using value: " + a2);
});
a = "After";
task.Start();
Erwin Mayer
  • Why should they? They are asynchronous anyway. – Vlad Jun 15 '12 at 10:58
  • Possible duplicate "C# Captured Variable in Loop" http://stackoverflow.com/questions/271440/c-sharp-captured-variable-in-loop – Panagiotis Kanavos Jun 15 '12 at 10:58
  • IMHO, you end up asking 2 questions here - the 'real' one appears to be in the title (how to capture the value, such that the task runs on the value at loop time), but then the body of the question seems to focus on 'why do these things result in unexpected values' (the effect of the closure capture meaning they're all referencing the same variable). Thus, you end up with most of the answers explaining the behavior instead of answering your 'real' question (AFAICT :) – James Manning Jun 15 '12 at 11:17
  • True James, actually I changed my question title to better reflect what was my point, after reading the initial comments. – Erwin Mayer Jun 15 '12 at 11:27

4 Answers

32

This has more to do with lambdas than with threading. A lambda captures the variable itself, not the value the variable had when the lambda was created. This means that when the lambda finally reads i, it sees whatever was stored in i last.
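A minimal sketch of that capture behavior with no tasks involved, so the outcome is deterministic. The names `CaptureDemo` and `ReadAfterReassign` are made up for illustration and are not part of the question's code.

```csharp
using System;

static class CaptureDemo
{
    // Hypothetical helper: builds a lambda that reads 'a', reassigns 'a',
    // and only then invokes the lambda.
    public static string ReadAfterReassign()
    {
        var a = "Before";
        Func<string> read = () => "Using value: " + a; // captures the variable a, not "Before"
        a = "After";
        return read(); // the lambda reads a now, after the reassignment
    }

    static void Main()
    {
        Console.WriteLine(ReadAfterReassign()); // prints "Using value: After"
    }
}
```

The same thing happens with tasks; the only difference is that the delay between creating the lambda and running it is decided by the scheduler.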

You might try to avoid this by copying the variable's value to a local variable when the lambda starts. The problem is that starting a task has overhead, so the first copy may be made only after the loop has finished. The following code therefore also fails:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(() => {
        var i1=i;
        Debug.Print("Error: " + i1.ToString());
    });
}

As James Manning noted, you can add a variable local to the loop body and copy the loop variable into it. This creates 50 separate variables, one per iteration, each holding its own copy of the loop variable, so you get the expected result. The drawback is that you get a lot of additional allocations.

for (var i = 0; i < 50; ++i) {
    var i1=i;
    Task.Factory.StartNew(() => {
        Debug.Print("Error: " + i1.ToString());
    });
}

The best solution is to pass the loop parameter as a state parameter:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(o => {
        var i1=(int)o;
        Debug.Print("Error: " + i1.ToString());
    }, i);
}

Using a state parameter results in fewer allocations. Looking at the decompiled code:

  • the second snippet will create 50 closures and 50 delegates
  • the third snippet will create 50 boxed ints but only a single delegate
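To check that the state-parameter version really hands each task its own value, here is a small sketch that returns the values from the tasks instead of printing them. It assumes .NET 4.5 or later for `Task.WhenAll`; the names `StateParameterDemo` and `RunWithState` are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

static class StateParameterDemo
{
    // Hypothetical helper: starts 'count' tasks, each receiving the loop
    // value as its state argument, and returns what each task saw.
    public static int[] RunWithState(int count)
    {
        var tasks = new Task<int>[count];
        for (var i = 0; i < count; ++i)
        {
            // i is boxed and stored with the task at creation time,
            // so later changes to i do not affect this task.
            tasks[i] = Task.Factory.StartNew(state => (int)state, i);
        }
        // WhenAll returns the results in the order of the tasks array.
        // Blocking on Result is acceptable in a console sketch like this one.
        return Task.WhenAll(tasks).Result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", RunWithState(5))); // prints "0, 1, 2, 3, 4"
    }
}
```

The lambda does not reference any outer variable, so no closure object is needed; the only per-iteration cost visible in the source is the boxing of i.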
Panagiotis Kanavos
  • The situation is well known; it is described here: http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx – Ivan Golović Jun 15 '12 at 11:16
  • For the first loop, the 'right' fix (AFAICT) is to do the var i1 = i; inside the loop but before the Task.Factory.StartNew. With that change, each closure will refer to its own separate variable and you'll get the right effect. The state parameter avoids the need for the closure, though, so certainly more efficient, but not necessary if you just want the correct behavior. – James Manning Jun 15 '12 at 11:19
  • It's not that it doesn't work (that's the way the language works), it's that the lambdas may start execution only AFTER the loop finishes – Panagiotis Kanavos Jun 15 '12 at 11:19
  • @James Manning, you are right, this creates a variable local to the loop only so there is no chance of capturing the wrong variable – Panagiotis Kanavos Jun 15 '12 at 11:20
  • @PanagiotisKanavos - based on Erwin's comment, if you change the first code chunk to make that change, it sounds like he'll accept it as the answer. – James Manning Jun 15 '12 at 11:21
  • Both your solutions will result in the same number of allocations: in the first case, it's 50 closure objects, and in the second case it's 50 boxed `int`s. So I'm not so sure the second one will be more efficient. – svick Jun 15 '12 at 11:58
  • The compiler creates more than one object per capture. This is described by Stephen Toub at http://blogs.msdn.com/b/pfxteam/archive/2012/02/03/10263921.aspx – Panagiotis Kanavos Jun 15 '12 at 12:16
  • After looking at decompiled code for both cases, the code with captures generates 50 closure and 50 delegates while the code using object state will create 50 boxed ints and ONLY a single action delegate. – Panagiotis Kanavos Jun 15 '12 at 12:35
  • For the object state example, there might only be one action delegate, but doesn't the implementation, internally, have to create a new object to store that state so that it can pass it once the action is invoked? – Sotirios Delimanolis Jul 18 '15 at 04:45
  • @SotiriosDelimanolis Yes the task has to allocate space to store the state parameter – Shannon Jun 30 '20 at 13:09
  • If somebody still cares about the great article from Eric Lippert, referenced by @IvanG here it is: https://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/#more-1441 – Jan Suchotzki Nov 08 '20 at 11:48
4

That's because you are running the code in a new thread, and the main thread immediately goes on to change the variable. If the lambda expression were executed immediately, the entire point of using a task would be lost.

The thread doesn't get its own copy of the variable at the time the task is created; all the tasks use the same variable (which is actually stored in the closure for the method, so it is no longer an ordinary local variable).

Guffa
3

Lambda expressions capture not the value of the outer variable but a reference to it. That is why you see 50 or After in your tasks.

To solve this, create a copy of the variable before your lambda expression and capture the copy instead.

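C# 5 (the compiler that ships with .NET 4.5) changes this behaviour for foreach loops only: each iteration gets its own fresh loop variable. A for loop still uses a single variable for all iterations, so there you still need the copy.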

Example:

    List<Action> acc = new List<Action>();
    for (int i = 0; i < 10; i++)
    {
        int tmp = i;
        acc.Add(() => { Console.WriteLine(tmp); });
    }

    acc.ForEach(x => x());
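
For comparison, a sketch contrasting the two loop forms, assuming a C# 5 or later compiler. Delegates are collected during the loop and invoked only afterwards, which mimics a task that starts late. The names `LoopCaptureDemo`, `CaptureInFor` and `CaptureInForeach` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LoopCaptureDemo
{
    // A for loop has one variable i shared by every iteration.
    public static int[] CaptureInFor()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        // Invoked after the loop, when i is already 3.
        return funcs.Select(f => f()).ToArray();
    }

    // In C# 5 and later, foreach declares a fresh variable per iteration.
    public static int[] CaptureInForeach()
    {
        var funcs = new List<Func<int>>();
        foreach (int i in new[] { 0, 1, 2 })
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureInFor()));     // prints "3, 3, 3"
        Console.WriteLine(string.Join(", ", CaptureInForeach())); // prints "0, 1, 2"
    }
}
```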
Alois Kraus
  • Do you mean creating a copy in the lambda expression will work? Currently it doesn't: Using var a2 = a; Logging.Print("Using value: " + a2); still returns "Using value: After". – Erwin Mayer Jun 15 '12 at 11:07
  • Sorry. You need to place the copy outside the lambda to make it work. – Alois Kraus Jun 15 '12 at 11:12
2

Lambda expressions are by definition lazily evaluated, so they are not evaluated until they are actually called, in your case when the task executes. If you close over a local in your lambda expression, the lambda sees the state of that local at the time of execution, which is what you observe. You can take advantage of this. For example, your for loop really doesn't need a new lambda for every iteration. Assuming, for the sake of the example, that the described result was what you intended, you could write

var i = 0;
// One delegate shared by all tasks; it reads i when it runs, not when it is created
Action action = () => Debug.Print("Error: " + i);
for (; i < 50; ++i) {
    Task.Factory.StartNew(action);
}

On the other hand, if you wanted it to actually print "Error: 0" ... "Error: 49", you could change the above to

var i = 0;
// A factory: calling action(i) evaluates i immediately and returns a new Action closed over x
Func<int, Action> action = x => () => Debug.Print("Error: " + x);
for (; i < 50; ++i) {
    Task.Factory.StartNew(action(i));
}

The first version closes over i and uses the state of i at the time the Action is executed, which is often going to be the state after the loop finishes. In the latter case i is evaluated eagerly because it is passed as an argument to a function. This function then returns an Action, which is passed to StartNew.

So the design decision makes both lazy evaluation and eager evaluation possible: lazy because locals are closed over, and eager because you can force locals to be evaluated by passing them as an argument or, as shown below, by declaring another local with a shorter scope

for (var i = 0; i < 50; ++i) {
    var j = i;
    Task.Factory.StartNew(() => Debug.Print("Error: " + j));
}

All of the above applies to lambdas in general. In the specific case of StartNew there is actually an overload that does what the second example does, so that can be simplified to

var i = 0;
// No closure: the value of i is passed to StartNew as the state argument
Action<object> action = x => Debug.Print("Error: " + x);
for (; i < 50; ++i) {
    Task.Factory.StartNew(action, i);
}
Rune FS