4

Let's start with some source data to query:

int[] someData = { 1, 2 };

After running the following code, things work as I expect: a contains 2 elements which boil down to 1 and 2 pulled from someData.

List<IEnumerable<int>> a = new List<IEnumerable<int>>();
a.Add(someData.Where(n => n == 1));
a.Add(someData.Where(n => n == 2));

But this next code, which does exactly the same thing only in a loop, does not work as expected. When this code completes, b contains 2 elements but they are both identical - pointing to 2. On the 2nd loop, it modifies the first element of b.

List<IEnumerable<int>> b = new List<IEnumerable<int>>();
for (int i = 1; i <= 2; ++i)
{
    b.Add(someData.Where(n => n == i));
}

Why is this happening and how can I make the loop version behave like the first version?

tenfour
  • 36,141
  • 15
  • 83
  • 142
  • 3
    See http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx – dlev Nov 01 '11 at 20:40
  • 1
    @tenfour: You're accessing a modified closure, that is, the lambda expression is capturing the variable `i` which is changing. – James Michael Hare Nov 01 '11 at 20:42
  • @Marc yes of course this is just a minimal example to demonstrate the behavior; it's not practical to put the real code in my question. – tenfour Nov 01 '11 at 20:49
  • @Marc, no, AddRange is not what he means. It is a list of sequences, which is precisely what he is adding with `Add`. If it was instead a list of integers, he actually wouldn't be having this problem at all, as then he *would* use `AddRange` and the sequence would be fully evaluated inside the loop. – Anthony Pegram Nov 01 '11 at 20:49
  • @Marc, look at the type of the list. – Anthony Pegram Nov 01 '11 at 20:54
  • @AnthonyPegram Ahh, missed the `IEnumerable`. – Marc Nov 01 '11 at 20:55

4 Answers4

7

Jon Skeet has a nice answer here

You need to assign i to a temp variable and use that in the Linq query

List<IEnumerable<int>> b = new List<IEnumerable<int>>();
for (int i = 1; i <= 2; ++i)
{
    int temp = i;
    b.Add(someData.Where(n => n == temp));
}
Community
  • 1
  • 1
Aducci
  • 26,101
  • 8
  • 63
  • 67
4

Your problem is lazy evaluation. You add an Enumerable to b that represents someData.Where(n => n == i). This is evaluated whenever you look into an element of b with the value i has at that time.

You want to manifest the enumerable by calling ToArray() or ToList() on it.

for (int i = 1; i <= 2; ++i)
{
    b.Add(someData.Where(n => n == i).ToArray());
}

Alternatively you can reduce the scope of the captured variable:

for (int i = 1; i <= 2; ++i)
{
    int localI=i;
    b.Add(someData.Where(n => n == localI));
}

Then you still have lazily evaluated enumerables(which shows when you modify someData), but each has a different i.

CodesInChaos
  • 106,488
  • 23
  • 218
  • 262
3

For each declaration of the where clause in the loop, there is a separate Func<int, bool> instance that is being passed into it. Since the value of i is being passed to the lambda function associated with each instance of Func<int, bool> (Func<int, bool> being essentially a delegate), i is a captured variable that is shared amongst each instance of Func<int, bool>.

In other words, i must be "kept alive", even outside the scope of the loop, in order for its value to be evaluated whenever any instance of Func<int, bool> is invoked. Normally, this wouldn't be a problem except that the invocation will not occur until it is necessary (i.e. there is need to enumerate through the results of the query that the delegate instance is passed to. An example would be a foreach loop: only then would the delegate be invoked in order to determine the results of the query for the purposes of the iteration).

This means that the LINQ queries declared in the loop won't really be made until some time after the conclusion of the for loop, meaning that i would be set to the value of 2. As a result, you are actually doing something like this:

 b.Add(someData.Where(n => n == 2)); 
 b.Add(someData.Where(n => n == 2));

To prevent this from happening, for each iteration in the loop, you need to declare a separate instance of the integer type and make it equivalent to i. Pass this to the lambda function declared in the iteration and each instance of Func<int, bool> will have a separate captured variable whose value will not be modified after each subsequent iteration. For example:

 for (int i = 1; i <= 2; ++i)  
 {
      int j = i;  
      b.Add(someData.Where(n => n == j));  
 } 
Jimmy W
  • 539
  • 3
  • 11
1

This is a slightly hidden variation on the captured loop-var problem.

You could solve it like this:

List<IEnumerable<int>> b = new List<IEnumerable<int>>();
for (int i = 1; i <= 2; ++i)
{
    int j = i;  // essential
    b.Add(someData.Where(n => n == j));
}

But slightly more intuitive would be

List<IEnumerable<int>> b = new List<IEnumerable<int>>();
for (int i = 1; i <= 2; ++i)
{        
    b.Add(someData.Where(n => n == i).ToList());
}

What happens in you original code is that the variable i is captured (closed-over) and a reference is stored in the lambdas. And you results are IEnumerables which menas deferred execution. The value of i is only fetched when the results are shown/examined, and by that time it is 2.

H H
  • 263,252
  • 30
  • 330
  • 514