4

I just found this lambda expression:

myCustomerList.GroupBy(cust => cust.CustomerId).Select(grp => grp.First());

Correct me if I am wrong, but with this lambda you can make myCustomerList distinct on the CustomerId, and that's exactly what I need. But I am trying to figure out how it works.

The first step is the GroupBy: this results in a dictionary, IGrouping<long, Customer>, with the CustomerId as the key of the dictionary.

Second, a Select takes place, and this is the part I don't get. The Select selects a Customer, but how can it select a Customer from a dictionary? You need a key for this, because of the GroupBy. Where's that key? And how is First() helping here?

Can you tell me in detail how the last part works?

BoltClock
Martijn
  • There is a similar question here: http://stackoverflow.com/questions/436954/whos-on-dictionary-first – Marshal Sep 06 '11 at 09:49

2 Answers

4

It's not selecting it from the dictionary - it's saying for each grouping in the result of GroupBy, select the first entry. Note that IGrouping<TKey, TElement> implements IEnumerable<TElement>.

Basically a group has two things:

  • A key
  • A list of elements

This is selecting the first element from each group.
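
To make the two parts of a group concrete, here is a minimal sketch. The sample data is hypothetical, and named tuples stand in for the Customer class from the question, so treat it as an illustration rather than the asker's actual code. It shows that each group exposes a Key and can be enumerated directly, which is why First() works on it without any dictionary lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; a named tuple stands in for the Customer class.
var myCustomerList = new List<(string Name, long CustomerId)>
{
    ("Alice", 1L),
    ("Alice (duplicate)", 1L),
    ("Bob", 2L),
};

// GroupBy yields one IGrouping per distinct CustomerId.
var groups = myCustomerList.GroupBy(cust => cust.CustomerId);

foreach (var grp in groups)
{
    // Each group carries its key and is itself a sequence of elements.
    Console.WriteLine($"Key = {grp.Key}, Count = {grp.Count()}");
}

// First() is the ordinary LINQ operator, applied to each group in turn.
var distinct = groups.Select(grp => grp.First()).ToList();
```

The key is never needed to read the elements: the lambda passed to Select receives the whole group, and the group is already the sequence of customers sharing that key.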

Jon Skeet
  • But what happens then in the background? When I see this, it suggests - in my opinion - that a dictionary is made (which implements IEnumerable), and that when that dictionary is created, the select is executed. But this is not the case? – Martijn Sep 06 '11 at 11:05
  • @Martijn: No, that's not the case. The grouping still isn't evaluated until something tries to read from it - although at that point the input sequence is completely consumed. It's probably best to just point you at my Edulinq blog post for GroupBy: http://msmvps.com/blogs/jon_skeet/archive/2011/01/01/reimplementing-linq-to-objects-part-21-groupby.aspx – Jon Skeet Sep 06 '11 at 11:10
  • I still don't get it completely. The GroupBy isn't executed when the select takes place? But when is it executed, since the First() method does select the first object of each group? – Martijn Sep 06 '11 at 11:18
  • @Martijn: If you don't use the results of the `Select()`, nothing's going to ask that for its first element - which means it's not going to ask the grouping for its first element. You need to read up on deferred execution: http://msmvps.com/blogs/jon_skeet/archive/2010/09/03/reimplementing-linq-to-objects-part-2-quot-where-quot.aspx gives some of the details. – Jon Skeet Sep 06 '11 at 11:23
  • I see, when (for example) `ToList()` is called at the end, everything gets executed. And when it is executed, I come back to my first comment: what happens? A dictionary is created (which implements IEnumerable) and based on this result, the select is executed with the First() method? And if this is the case, what is the value of the `Key` property for the dictionary? – Martijn Sep 06 '11 at 12:49
  • @Martijn: The dictionary which is created internally would be something like `Dictionary<long, List<Customer>>` - each grouping represents an entry in that dictionary: the key and the list of corresponding values. So `First` is called on *each* of the lists of values. The Key property here is the customer ID. I provide a sample implementation in the blog post referenced in an earlier comment - I suggest you work through that. – Jon Skeet Sep 06 '11 at 12:53
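
The deferred execution discussed in these comments can be observed with a small sketch. The counter and the tuple-based sample data are hypothetical additions for illustration only. Nothing runs when the query is defined; the key selector is only invoked once something such as `ToList()` reads the results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; a named tuple stands in for the Customer class.
var source = new List<(string Name, long CustomerId)> { ("a", 1L), ("b", 2L) };
int keySelectorCalls = 0;

// Defining the query does not execute GroupBy or Select.
var query = source
    .GroupBy(cust => { keySelectorCalls++; return cust.CustomerId; })
    .Select(grp => grp.First());

int callsBeforeEnumeration = keySelectorCalls; // still 0 here

// An element added after the query was defined is still picked up,
// because the source is only read when the query is enumerated.
source.Add(("c", 3L));

var result = query.ToList(); // the whole input is consumed at this point
int callsAfterEnumeration = keySelectorCalls; // 3, one call per source element

Console.WriteLine($"{callsBeforeEnumeration} -> {callsAfterEnumeration}, {result.Count} results");
```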
3

Let's say your collection is:

{Name=a, CustomerId=1}
{Name=a, CustomerId=1}
{Name=b, CustomerId=2}
{Name=b, CustomerId=2}

After group by it becomes

{ key = 1, Values = {Name=a, CustomerId=1}, {Name=a, CustomerId=1} }
{ key = 2, Values = {Name=b, CustomerId=2}, {Name=b, CustomerId=2} }

After the last select (i.e. selecting the first element from the Values in the above notation) it becomes:

{Name=a, CustomerId=1}
{Name=b, CustomerId=2}

Hence the result contains one customer per distinct ID.
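
The walk-through can be reproduced with a short sketch that uses the four-element collection listed at the top of this answer. Named tuples are used in place of a Customer class, which is an assumption made to keep the example self-contained.

```csharp
using System;
using System.Linq;

// The four-element collection from the start of the example,
// with a named tuple standing in for the Customer class.
var customers = new[]
{
    (Name: "a", CustomerId: 1L),
    (Name: "a", CustomerId: 1L),
    (Name: "b", CustomerId: 2L),
    (Name: "b", CustomerId: 2L),
};

// Step 1: group by ID, giving one group per distinct CustomerId.
var grouped = customers.GroupBy(c => c.CustomerId).ToList();

// Step 2: keep only the first element of each group.
var distinctCustomers = grouped.Select(g => g.First()).ToList();

foreach (var c in distinctCustomers)
{
    Console.WriteLine($"{{Name={c.Name}, CustomerId={c.CustomerId}}}");
}
```

Groups come back in the order in which each key first appears in the source, so the customer with ID 1 is listed before the customer with ID 2.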

Ankur