I found an interesting behavior of LINQ query results while working with C#. I tried to figure it out, but couldn't find a proper explanation of why it works the way it does. So I'm asking here; maybe someone can give me a good explanation of the inner workings that lead to this behavior, or point me to some links.

I have this class:

    public class A
    {
        public int Id { get; set; }

        public int? ParentId { get; set; }
    }

And this object:

    var list = new List<A>
    {
        new A { Id = 1, ParentId = null },
        new A { Id = 2, ParentId = 1 },
        new A { Id = 3, ParentId = 1 },
        new A { Id = 4, ParentId = 3 },
        new A { Id = 5, ParentId = 7 }
    };

And my code, that works with this object:

    var result = list.Where(x => x.Id == 1).ToList();
    var valuesToInsert = list.Where(x => result.Any(y => y.Id == x.ParentId));

    Console.WriteLine(result.Count); // 1
    Console.WriteLine(valuesToInsert.Count()); // 2

    foreach (var value in valuesToInsert)
    {
        result.Add(value);
    }

    Console.WriteLine(valuesToInsert.Count()); // 3: the collection (and its count) changed inside the foreach loop
    Console.WriteLine(result.Count); // 4

So the Count of result is 1, the count of valuesToInsert is 2, and after the foreach loop (which doesn't change valuesToInsert explicitly) the count of valuesToInsert has changed. And although valuesToInsert had a count of two at the start of the foreach, the loop makes three iterations.

So why can the contents of this enumerable change inside the foreach? And, for example, if I use this code to change the contents of an enumerable:

    var testEn = list.Where(x => x.Id == 1);
    foreach (var x in testEn)
    {
        list.Add(new A { Id = 1 });
    }

I get a System.InvalidOperationException: 'Collection was modified; enumeration operation may not execute.' What is the difference between the two cases? Why can one collection be modified while the other cannot?

P.S. If I add ToList() like this:

    var valuesToInsert = list.Where(x => result.Any(y => y.Id == x.ParentId)).ToList();

Or like this:

    foreach (var value in valuesToInsert.ToList())

It makes only two iterations.

helgez
  • The question itself aside, the code seems contrived for what it intends to achieve. If you want the parent and its children, you can do `list.Where(x => x.Id == 1 || x.ParentId == 1)`. If you only want the children, `list.Where(x => x.ParentId == 1)` – Flater Apr 12 '20 at 15:15
  • It is simplified code; in reality the nesting could be **more** than two levels deep. – helgez Apr 12 '20 at 15:25
  • Are you aware that in a statement like `var testEn = list.Where(x => x.Id == 1);`, `testEn` is just a "view" of `list` with only the elements that match the predicate, and that no new lists are created until you call `ToList()`? Looping through `testEn` is _just like_ looping through `list`, but with a condition. – Sweeper Apr 12 '20 at 15:28
  • "No new lists" - you mean no new collections in memory? – helgez Apr 12 '20 at 15:44

3 Answers

There are multiple questions here:

So, after the first query the Count of result is 1, after the second query the count of valuesToInsert is 2, and after the foreach loop (which doesn't change valuesToInsert explicitly) the count of valuesToInsert changes.

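This is expected, because the lambda in the valuesToInsert query holds a reference to the same List object that the result variable refers to. There is one object with multiple references pointing to it, so items added to result inside the loop are visible to the query the next time its predicate runs.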

Your second question:

So why can the contents of this enumerable change inside the foreach?

A reference of type IEnumerable only lets us read the sequence; it exposes no Add method. When we call ToList() on it, the query runs and its results are copied into a new List. That list holds references to the same element objects as the original collection, and we can add more items to it.

When we have the collection as an IEnumerable, it can be iterated and read, but adding more items while enumerating would fail, as the collection is supposed to be read sequentially.

Third:

It makes only two iterations.

Yes, because ToList() enumerates the query at that moment and stores whatever items it produced in a new List. That list is a snapshot that is no longer connected to the query, so items added to result afterwards do not change it, and because its type is List we can add more items to it.

See:

    var result = list.Where(x => x.Id == 1).ToList();
    // result is a List, which can be modified: items added, removed, etc.

    var query = list.Where(x => x.Id == 1);
    // query is an IEnumerable, which can only be iterated to get items one by one;
    // modifying list while iterating query would normally throw
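
The difference between the two forms can be seen in a minimal sketch. The class and method names here (`DeferredVsSnapshot`, `Run`) are made up for illustration and are not part of the question's code; the point is only that the deferred query re-reads its source on every enumeration, while the list produced by `ToList()` does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredVsSnapshot
{
    // Returns the item count seen by the deferred query and by the snapshot
    // after the source list has grown.
    public static (int Deferred, int Snapshot) Run()
    {
        var source = new List<int> { 1, 2, 3 };

        IEnumerable<int> deferred = source.Where(n => n > 1);   // nothing is evaluated yet
        List<int> snapshot = source.Where(n => n > 1).ToList(); // evaluated now: copies 2 and 3

        source.Add(4); // mutate the source after both were created

        // Count() enumerates the query again and sees 2, 3, 4; the snapshot still holds 2, 3.
        return (deferred.Count(), snapshot.Count);
    }

    public static void Main()
    {
        var (deferred, snapshot) = Run();
        Console.WriteLine(deferred); // 3
        Console.WriteLine(snapshot); // 2
    }
}
```
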
Ehsan Sajjad
  • > When we have the collection as an IEnumerable, it can be iterated and read, but adding more items while enumerating would fail, as the collection is supposed to be read sequentially. What do you mean by "adding would fail while enumerating"? We obviously can't use the `.Add()` method on an IEnumerable, but in my code objects **are** added to the IEnumerable collection while enumerating (at the start of the foreach, `valuesToInsert` has only two objects, then it grows to three, and the foreach makes **three** iterations in total). – helgez Apr 12 '20 at 15:21
  • But you are adding using list reference – Ehsan Sajjad Apr 12 '20 at 16:37

The valuesToInsert collection has a reference to the result collection in the Where clause:

    var valuesToInsert = list.Where(x => result.Any(y => y.Id == x.ParentId));

Because Where is evaluated lazily (it behaves like an iterator written with yield return), the predicate runs against the current contents of the result collection for each item produced.
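
To make the lazy evaluation visible, here is a minimal sketch that swaps `Enumerable.Where` for a hand-written iterator. `WhereSketch` is a hypothetical stand-in written for this illustration, not the actual LINQ implementation; the data is the list from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
}

public static class LazyWhereDemo
{
    // Hypothetical stand-in for Enumerable.Where, written as an iterator.
    public static IEnumerable<T> WhereSketch<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            // The predicate runs only when the consumer asks for the next item.
            if (predicate(item))
                yield return item;
        }
    }

    // Returns the Ids yielded while the loop body keeps adding to result.
    public static List<int> Run()
    {
        var list = new List<A>
        {
            new A { Id = 1, ParentId = null },
            new A { Id = 2, ParentId = 1 },
            new A { Id = 3, ParentId = 1 },
            new A { Id = 4, ParentId = 3 },
            new A { Id = 5, ParentId = 7 }
        };

        var result = list.Where(x => x.Id == 1).ToList();
        var valuesToInsert = WhereSketch(list, x => result.Any(y => y.Id == x.ParentId));

        var yielded = new List<int>();
        foreach (var value in valuesToInsert)
        {
            result.Add(value);      // changes what the predicate sees for later items
            yielded.Add(value.Id);
        }
        return yielded;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 2, 3, 4
    }
}
```

Item 4 is yielded only because item 3 was already added to `result` by the time the predicate was evaluated for it.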

If you don't want this behavior, you should first evaluate valuesToInsert using ToList():

    foreach (var value in valuesToInsert.ToList())

Regarding the 'Collection was modified' exception: you can't change a collection while it is being enumerated. Here the result collection is changed, but not while it is being enumerated. It is enumerated (by Any) only each time the foreach loop requests a new item, and that enumeration has already finished by the time the loop body calls Add. (Re-scanning result for every item makes your algorithm for adding the children less efficient, which will become noticeable for huge collections.)
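
One possible way to avoid re-scanning `result` for every item is to group the items by `ParentId` once and then walk the tree with a queue. This is only a sketch under assumptions: the names `DescendantsSketch` and `CollectWithDescendants` are made up here, and the data is assumed to contain no cycles (a cycle would make the loop run forever).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
}

public static class DescendantsSketch
{
    // Returns the root item followed by its children, grandchildren, and so on.
    // Assumes the parent links contain no cycles.
    public static List<A> CollectWithDescendants(List<A> list, int rootId)
    {
        // Group the items by parent once, so children are found without rescanning.
        var childrenByParent = list
            .Where(x => x.ParentId.HasValue)
            .ToLookup(x => x.ParentId.Value);

        var result = list.Where(x => x.Id == rootId).ToList();
        var pending = new Queue<A>(result);

        while (pending.Count > 0)
        {
            var parent = pending.Dequeue();
            // A lookup returns an empty sequence for a key it does not contain.
            foreach (var child in childrenByParent[parent.Id])
            {
                result.Add(child);
                pending.Enqueue(child);
            }
        }
        return result;
    }

    public static void Main()
    {
        var list = new List<A>
        {
            new A { Id = 1, ParentId = null },
            new A { Id = 2, ParentId = 1 },
            new A { Id = 3, ParentId = 1 },
            new A { Id = 4, ParentId = 3 },
            new A { Id = 5, ParentId = 7 }
        };
        var ids = CollectWithDescendants(list, 1).Select(x => x.Id);
        Console.WriteLine(string.Join(", ", ids)); // 1, 2, 3, 4
    }
}
```

Unlike the single pass over the lazy query, this version does not depend on children appearing after their parents in the list.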

Wouter
  • My algorithm should add not only children, but also children of children, and so on. Do you know how to write it **more** efficiently? – helgez Apr 12 '20 at 16:03

This block of code:

    foreach (var value in valuesToInsert)
    {
        result.Add(value);
    }

...is transformed by the C# compiler to this equivalent block of code:

    IEnumerator<A> enumerator = valuesToInsert.GetEnumerator();
    try
    {
        while (enumerator.MoveNext())
        {
            var value = enumerator.Current;
            result.Add(value);
        }
    }
    finally
    {
        enumerator.Dispose();
    }

The enumerator returned by a List is invalidated when the List is mutated, which means that its MoveNext method will throw an InvalidOperationException if it is invoked after a mutation. In this case valuesToInsert is not a List, but an enumerable returned by the LINQ method Where. That method works by lazily enumerating an enumerator it obtains from its source, which in this case is the list. So enumerating one enumerator indirectly causes the enumeration of another, hidden deeper in the LINQ chain.

In the first case the list is not mutated inside the enumeration block (only result is), so no exception is thrown. In the second case it is mutated, causing an exception that is propagated from one MoveNext to the other, and eventually thrown by the foreach statement.
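
The two cases can be compared side by side in a small sketch. The class and method names are invented for this illustration, and the expected output assumes a runtime where the List enumerator throws after a mutation, as it does in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvalidationDemo
{
    // Mutates the list that the Where query is reading from.
    public static bool MutatingSourceThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        var query = list.Where(n => n >= 1);
        try
        {
            foreach (var n in query)
                list.Add(n + 10); // invalidates the inner List enumerator
            return false;
        }
        catch (InvalidOperationException)
        {
            // Raised by the next MoveNext call, which Where forwards to the list's enumerator.
            return true;
        }
    }

    // Mutates a different list, which no enumerator is currently reading.
    public static bool MutatingOtherListThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        var other = new List<int>();
        var query = list.Where(n => n >= 1);
        try
        {
            foreach (var n in query)
                other.Add(n); // list itself is untouched
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(MutatingSourceThrows());    // True
        Console.WriteLine(MutatingOtherListThrows()); // False
    }
}
```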

It's worth noting that this behavior is not part of the public contract of the List class, so it could be changed in a future version of .NET. So you should probably avoid depending on this behavior for the correctness of your program. This warning is not theoretical. A change like this has already happened with the Dictionary class in .NET Core 3.0, where the Remove and Clear methods no longer invalidate enumerators.

Theodor Zoulias
  • Which part of behaviour could be changed? That the method MoveNext will throw an `InvalidOperationException` if it is invoked after a mutation? – helgez Apr 13 '20 at 07:16
  • @helgez yeap, this. In a future version of .NET the exception may not be thrown. – Theodor Zoulias Apr 13 '20 at 07:44