When I add .ToList() at the end of the Select() statement it works as expected. What's the difference between IEnumerable and List in this case?

using System;
using System.Collections.Generic;
using System.Linq;

class Foo{
    public string Name { get; set; }
    public Guid Id { get; set; }
}
class Program
{
    static void Main(string[] args)
    {
        var list = new List<string>(){ "first", "last" };
        var bar = list.Select(x => new Foo {
            Name = x,
            Id = Guid.NewGuid()
        }); //.ToList();
        Console.WriteLine("bar Count = " + bar.Count());
        Console.WriteLine();
        for (int i = 0; i < 3; i++) { 
            Console.WriteLine(bar.First().Name + " " + bar.First().Id);
            Console.WriteLine(bar.Last().Name + " " + bar.Last().Id);
            Console.WriteLine();
        }
    }
}

Result:

bar Count = 2

first 1fabd003-340c-44a4-81ca-1908f206ccdb
last 213f40c9-0705-4676-ad1f-73de21c5bc62

first 9bc117e5-1f10-4855-af95-46258b39955c
last cd787227-12d9-4e25-88bc-634369d85671

first c787e081-8dd3-475c-bf24-44e5f103ed3a
last 7a72080d-2dee-47b1-a55d-af0a7665d0ea

Result with .ToList()

bar Count = 2

first 3c1d4f1d-eecd-438a-86a4-e7db0680cad3
last cba1867d-d246-477e-b938-0f9cce457a5c

first 3c1d4f1d-eecd-438a-86a4-e7db0680cad3
last cba1867d-d246-477e-b938-0f9cce457a5c

first 3c1d4f1d-eecd-438a-86a4-e7db0680cad3
last cba1867d-d246-477e-b938-0f9cce457a5c

Pavel Anikhouski
Casi
  • The question is not entirely clear, but I'm guessing it has something to do with the fact that a `Select` LINQ query defers action until enumerated. You're enumerating it when you add `ToList` – JSteward Jan 31 '20 at 21:41
  • It seems that you're confusing `IEnumerable` with a collection. An `IEnumerable` is not really a collection but more like a draft of one: it hands out an `IEnumerator`, the object that *knows* how to enumerate the sequence. Nothing executes until we call a function that needs to enumerate it. `.ToList()` enumerates it and returns a collection, while `.First()` executes it and returns the first element – Fabjan Jan 31 '20 at 21:43
  • I've been saying this for over a decade and it seems like every day it still needs to be repeated. **A query object represents a question, not an answer**. You can ask the same question twice and get a different answer! In your original code you are asking the question three times. In your modified code you are asking it once and caching the result. – Eric Lippert Jan 31 '20 at 21:43
  • Similarly we could say `Func<Guid> f = Guid.NewGuid;` and we would discover that `Console.WriteLine($"{f()}{f()}{f()}");` and `g = f(); Console.WriteLine($"{g}{g}{g}");` have different behaviour. Does this case surprise you, or would that be a difference that you expect? – Eric Lippert Jan 31 '20 at 21:49
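
The delegate analogy from the last comment can be tried directly. This is only a minimal sketch: the class name `FuncAnalogy` is made up for illustration, and `g` is declared explicitly as a `Guid` so that the snippet compiles on its own.

```csharp
using System;

class FuncAnalogy
{
    static void Main()
    {
        // f is the "question": every call asks it again.
        Func<Guid> f = Guid.NewGuid;
        Console.WriteLine($"{f()} {f()} {f()}"); // three different GUIDs

        // g is one "answer": asked once, then reused.
        Guid g = f();
        Console.WriteLine($"{g} {g} {g}");       // the same GUID three times
    }
}
```

The un-materialized `Select` in the question behaves like `f`; adding `.ToList()` turns it into something that behaves like `g`.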

4 Answers

When you use Select, you are defining a lazy (a.k.a. "deferred") enumerable: you are defining the rules for how its values are generated, but you are not yet generating the values themselves. A lazy enumerable is a blueprint and nothing more. It holds no results of its own, and the selector does not run until something enumerates it.

Following this, when you call bar.First() and bar.Last(), you are triggering the enumerable to start generating values, and each time you do this it generates completely new ones. That means that every time you call an enumerating method such as First(), Last() or Count() on bar, the selector runs again with new calls to Guid.NewGuid(), and naturally that produces new GUIDs every time.

However, if you call ToList() on your enumerable, you are generating the values once and then caching them in a List. Now when you call bar.First() and bar.Last(), you are reading the stored first and last elements of that List, and those do not change.
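
As a rough illustration of the two cases described above, the sketch below compares Ids obtained from the lazy enumerable with Ids obtained from the cached List. The names `DeferredVersusList`, `deferred` and `cached` are made up for this example; `Foo` is copied from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Foo
{
    public string Name { get; set; }
    public Guid Id { get; set; }
}

class DeferredVersusList
{
    static void Main()
    {
        var list = new List<string> { "first", "last" };

        // Deferred: nothing runs here, the lambda is only stored.
        IEnumerable<Foo> deferred = list.Select(x => new Foo { Name = x, Id = Guid.NewGuid() });

        // Materialized: the lambda runs once per element, right now.
        List<Foo> cached = deferred.ToList();

        // Each First() call re-runs the lambda, so the Ids differ.
        Console.WriteLine(deferred.First().Id == deferred.First().Id); // False

        // Each First() call reads the stored element, so the Ids match.
        Console.WriteLine(cached.First().Id == cached.First().Id);     // True
    }
}
```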

Abion47

LINQ's Select() uses deferred execution. While bar is still an IEnumerable, new Foo instances are generated every time you enumerate it, each with a new Id value. So whenever you call First() or Last(), you get a new instance with a different Guid value.

When ToList() is used, the LINQ expression is evaluated immediately and the Foo instances, with their Id values already calculated, are stored in the collection. Your loop then reads from that collection, so First() and Last() return the same instances on every iteration of the for loop. That is why you get the same Guid values every time.

You can put a breakpoint in the Id property setter and see how many times the value is set with the IEnumerable and with the List.
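
If a debugger is not at hand, a counter in the setter gives the same information as the breakpoint. This is only a sketch: the static `IdSetCount` field and the explicit backing field are additions made for the demonstration and are not part of the question's `Foo`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Foo
{
    // Hypothetical counter, added only for this demonstration.
    public static int IdSetCount;

    private Guid _id;
    public string Name { get; set; }
    public Guid Id
    {
        get { return _id; }
        set { IdSetCount++; _id = value; }
    }
}

class SetterCount
{
    static void Main()
    {
        var list = new List<string> { "first", "last" };

        var lazy = list.Select(x => new Foo { Name = x, Id = Guid.NewGuid() });
        Console.WriteLine(Foo.IdSetCount); // 0: defining the query runs nothing

        _ = lazy.First().Id;
        _ = lazy.First().Id;
        Console.WriteLine(Foo.IdSetCount); // 2: each First() call builds a new Foo

        Foo.IdSetCount = 0;
        var cached = lazy.ToList();
        Console.WriteLine(Foo.IdSetCount); // 2: each element is built exactly once

        _ = cached.First().Id;
        _ = cached.Last().Id;
        Console.WriteLine(Foo.IdSetCount); // still 2: reading the List sets nothing
    }
}
```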

Pavel Anikhouski

The creation of your objects happens inside your IEnumerable, so every time you enumerate it, it creates new items. Your calls to First() and Last() each start a separate iteration, and each iteration generates different content.

If you store the results first with ToList(), the content stays the same on each iteration.

Holger

var bar = list.Select(x => new Foo {
    Name = x,
    Id = Guid.NewGuid()
});

Every time the GetEnumerator method is called on bar, it produces an enumerator that executes the function for each element in list as that element is requested. Since there is a Guid.NewGuid() in the middle, the output is unstable: no two enumerations produce the same values.

Console.WriteLine("bar Count = " + bar.Count());
Console.WriteLine(bar.First().Name + " " + bar.First().Id);
Console.WriteLine(bar.Last().Name + " " + bar.Last().Id);

Each of these lines calls extension methods that enumerate bar. That makes five separate enumerations in total (Count() once, First() twice, Last() twice), and each one generates fresh GUIDs for every element it visits (First() is smart and stops after the first one).
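
One way to watch the repeated enumeration is to put a counting wrapper underneath the query. The sketch below is an illustration under stated assumptions: `CountingSource` is a hypothetical stand-in for the question's `list`, and an anonymous type replaces `Foo` to keep the example short.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper that counts how often it is enumerated.
class CountingSource : IEnumerable<string>
{
    private readonly string[] _items = { "first", "last" };
    public int Enumerations;

    public IEnumerator<string> GetEnumerator()
    {
        Enumerations++;
        return ((IEnumerable<string>)_items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

class EnumerationCount
{
    static void Main()
    {
        var source = new CountingSource();
        var bar = source.Select(x => new { Name = x, Id = Guid.NewGuid() });
        Console.WriteLine(source.Enumerations); // 0: Select alone enumerates nothing

        Console.WriteLine("bar Count = " + bar.Count());
        Console.WriteLine(bar.First().Name + " " + bar.First().Id);
        Console.WriteLine(bar.Last().Name + " " + bar.Last().Id);

        // Count() once, First() twice, Last() twice.
        Console.WriteLine(source.Enumerations); // 5
    }
}
```

Appending `.ToList()` to the `Select` call would move the single enumeration of `source` to that line, and the later calls would then read from the List instead.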

var bar = list.Select(x => new Foo {
    Name = x,
    Id = Guid.NewGuid()
}).ToList();

This makes a List<Foo> by running the Select enumerable once. Each later call to GetEnumerator from the extension methods gets the same stable data, because the List already holds its element values.

bartonjs