5

I was looking into Enumerable.ToLookup API which converts an enumerable sequence into a dictionary type data structure. More details can be found here:

https://msdn.microsoft.com/en-us/library/system.linq.enumerable.tolookup(v=vs.110).aspx

The only difference it carries from ToDictionary API is the fact that it won't give any error if the key selector results in duplicate keys. I need a comparison of deferred execution semantics of these two APIs. As far as I know, ToDictionary API results in immediate execution of the sequence i.e. it doesn't follow deferred execution semantics of LINQ queries. Can anyone help me with the deferred execution behavior of ToLookup API? Is it the same as ToDictionary API or there is some difference?

RBT
  • 24,161
  • 21
  • 159
  • 240

3 Answers3

5

Easy enough to test...

void Main()
{
    var lookup = Inf().ToLookup(i => i / 100);
    Console.WriteLine("if you see this, ToLookup is deferred"); //never happens
}

IEnumerable<int> Inf()
{
    unchecked
    {
        for(var i=0;;i++)
        {
            yield return i;
        }
    }
}

To recap, ToLookup greedily consumes the source sequence without deferring.

In contrast, the GroupBy operator is deferred, so you can write the following to no ill-effect:

var groups = Inf().GroupBy(i => i / 100); //oops

However, GroupBy is greedy, so when you enumerate, the entire source sequence is consumed.

This means that

groups.SelectMany(g=>g).First();

also fails to complete.

When you think about the problem of grouping, it quickly becomes apparent that when separating a sequence into a sequence of groups, it would be impossible to know if even just one of the groups were complete without completely consuming the entire sequence.

spender
  • 117,338
  • 33
  • 229
  • 351
  • I think this has a error in that it says first GroupBy is deferred, then it is greedy... is it both? – Nathan Bellowe Aug 13 '16 at 09:52
  • @NathanBellowe Are you confusing deferred with greedy? They are different behaviours. – spender Aug 13 '16 at 09:54
  • Loved your code snippet to help understand the immediate execution. Very nice. The program ultimately crashes because of infinite loop as Lookup API continues to create its internal structure before the WriteLine statement can be hit. – RBT Aug 14 '16 at 00:25
3

This was sort of covered here, but it was hard to find!

In short -- ToLookup does not defer execution!

  • ToLookup() -> immediate execution
  • GroupBy() (and other query methods) -> deferred execution
Nathan Bellowe
  • 4,662
  • 3
  • 15
  • 16
0

If you look at the reference implementation source code for both the Enumerable.ToDictionary() and the Enumerable.ToLookup() methods, you will see that both end up executing a foreach loop over the source enumerable. That's one way to confirm that the execution of the source enumerable is not deferred in both cases.

But I mean, the answer is pretty self evident in that if you start off with an enumerable, and the return value of the function is no longer an enumerable, then clearly, it must have been executed (consumed), no?

(That last paragraph was not accurate as pointed out by @spender in the comments)

sstan
  • 35,425
  • 6
  • 48
  • 66
  • The return value **is** enumerable. It's an `IEnumerable>` – spender Aug 13 '16 at 02:40
  • @spender: you must be referring to the `Enumerable.GroupBy()` function. – sstan Aug 13 '16 at 02:42
  • No, I'm referring to the `interface ILookup : IEnumerable>, IEnumerable` that is returned from `ToLookup`. – spender Aug 13 '16 at 02:44
  • I notice I got a downvote *after* my last edit. If there is still a blatant inaccuracy in my answer, don't be shy about dropping a comment. I won't get upset, I promise. Quite the opposite :) – sstan Aug 13 '16 at 10:47