-1

LINQ evaluates clauses from right to left? Is that why so many articles explain "lazy evaluation" using a Take operation at the end? In the following example, Code Snippet 2 is a lot faster than Code Snippet 1 because it doesn't call "ToList".

Code Snippet 1 (Takes about 13000 msec)

        var lotsOfNums = Enumerable.Range(0, 10000000).ToList();

        Stopwatch sw = new Stopwatch();
        sw.Start();

        // Get all the even numbers
        var a = lotsOfNums.Where(num => num % 2 == 0).ToList();

        // Multiply each even number by 100.
        var b = a.Select(num => num * 100).ToList();

        var c = b.Select(num => new Random(num).NextDouble()).ToList();

        // Get the top 10
        var d = c.Take(10);

        // a, b and c were each materialized by ToList(); d is still lazy and only runs in this loop.
        foreach (var num in d)
        {
            Console.WriteLine(num);
        }

        sw.Stop();
        Console.WriteLine("Elapsed milliseconds: " + sw.ElapsedMilliseconds);

Code Snippet 2 (3 msec)

        sw.Reset();
        sw.Start();

        var e = lotsOfNums.Where(num => num % 2 == 0).Select(num => num * 100).Select(num => new Random(num).NextDouble()).Take(10);
        foreach (var num in e)
        {
            Console.WriteLine(num);
        }
        sw.Stop();
        Console.WriteLine("Elapsed milliseconds: " + sw.ElapsedMilliseconds);
        Console.Read();

However, for Code Snippet 2, the relative position of "Take" does not seem to matter?

To be specific, I changed from: var e = lotsOfNums.Where(num => num % 2 == 0).Select(num => num * 100).Select(num => new Random(num).NextDouble()).Take(10);

To:

        var e = lotsOfNums.Take(10).Where(num => num % 2 == 0).Select(num => num * 100).Select(num => new Random(num).NextDouble());

There's no difference in performance?

Also worth noting: if you move the NextDouble projection to the front of the chain, then since LINQ evaluates left to right, the result list will be empty. NextDouble returns values in the range [0, 1), so practically none of them pass the num % 2 == 0 filter. Take(10) never receives its ten items, so the query has to loop through the whole list and takes much longer to evaluate.

        var e = lotsOfNums.Select(num => new Random(num).NextDouble()).Where(num => num % 2 == 0).Select(num => num * 100).Take(10);

user3761555
  • 5
    There are seven questions here and yet I cannot figure out what this question is asking. Can you clarify it? Try to narrow it down to a single, clear question. – Eric Lippert Sep 26 '19 at 16:46
  • 1
    [Order of LINQ extension methods does not affect performance?](https://stackoverflow.com/questions/10110013/order-of-linq-extension-methods-does-not-affect-performance) – Tim Schmelter Sep 26 '19 at 16:50
  • 1
    Side note: don't call `new Random(...)` in your projections. Instantiate it once and use that instance in your projections. – madreflection Sep 26 '19 at 17:35
  • The reason for new Random(newseed) is to get a new seed each time. – user3761555 Sep 26 '19 at 18:10
  • 1
    That's not good reasoning. If you want to make the same sequence of random numbers every time for testing purposes, make a *single* Random for each test, seeded *at the start of each test*. Don't generate a new seeded Random *before fetching every random number*. – Eric Lippert Sep 26 '19 at 18:37
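
The advice in these comments can be sketched as follows. This is only an illustration: the class name `SeededRandomDemo` and the method `Generate` are made up for this example and do not come from the question. A single `Random` is created with a fixed seed at the start, and the projection reuses that one instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SeededRandomDemo
{
    // Hypothetical helper: produces 'count' random doubles from ONE Random
    // instance that is seeded once, before any number is fetched.
    public static List<double> Generate(int seed, int count)
    {
        var rng = new Random(seed);
        return Enumerable.Range(0, count)
            .Select(_ => rng.NextDouble()) // reuse the same instance in the projection
            .ToList();
    }

    public static void Main()
    {
        // Two runs with the same seed produce the same sequence.
        Console.WriteLine(string.Join(", ", Generate(42, 3)));
        Console.WriteLine(string.Join(", ", Generate(42, 3)));
    }
}
```

Seeding once per test keeps runs repeatable without constructing a new generator for every element.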

2 Answers

8

LINQ evaluates clauses from right to left?

No, clauses are evaluated left to right. Everything is evaluated left to right in C#.

Is that why so many articles explain "lazy evaluation" using a Take operation at the end?

I don't understand the question.


UPDATE: I understand the question. The original poster believes incorrectly that Take has the semantics of ToList; that it executes the query, and therefore goes at the end. This belief is incorrect. A Take clause just appends a Take operation to the query; it does not execute the query.


You must put the Take operation where it needs to be. Remember, x.Take(y).Where(z) and x.Where(z).Take(y) are very different queries. You can't just move a Take around without changing the meaning of the query, so put it in the right place: as early as possible, but not so early that it changes the meaning of the query.
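
A minimal sketch of that difference, using the numbers 1 to 100 (the class and method names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeOrderDemo
{
    // Take first, then filter: only the first ten numbers are ever considered.
    public static List<int> TakeThenWhere() =>
        Enumerable.Range(1, 100).Take(10).Where(n => n % 2 == 0).ToList();

    // Filter first, then take: the first ten even numbers.
    public static List<int> WhereThenTake() =>
        Enumerable.Range(1, 100).Where(n => n % 2 == 0).Take(10).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", TakeThenWhere())); // 2, 4, 6, 8, 10
        Console.WriteLine(string.Join(", ", WhereThenTake())); // 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
    }
}
```

The two queries use the same clauses but return five items in one case and ten in the other, so they cannot be swapped freely.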

Position of "NextDouble" select clause matters?

Matters to who? Again, I don't understand the question. Can you clarify it?

Why do Code Snippet 1 and Code Snippet 2 have the same performance stats?

Since you have not given us your measurements, we have no basis upon which to make a comparison. But your two code samples do completely different things; one executes a query, one just builds a query. Building a query that is never executed is faster than executing it!

I thought "ToList" forces early evaluation and thus makes things slower?

That's correct.

There's no difference in performance? (between my two query constructions)

You've constructed two queries; you have not executed them. Construction of queries is fast, and not typically worth measuring. Measure the performance of the execution of the query, not the construction of the query, if you want to know how fast the query executes!
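
One way to see the difference between building and executing a query is to count how often the predicate runs. The sketch below is an illustration only; `DeferredExecutionDemo` and `CountPredicateCalls` are hypothetical names, and the source is a small range rather than the list from the question.

```csharp
using System;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the Where predicate had run (1) right after the
    // query was built and (2) after the query was actually enumerated.
    public static (int afterBuild, int afterEnumeration) CountPredicateCalls()
    {
        int calls = 0;
        var source = Enumerable.Range(0, 1000);

        // Building the query runs nothing; it only describes the work.
        var query = source
            .Where(n => { calls++; return n % 2 == 0; })
            .Select(n => n * 100)
            .Take(10);
        int afterBuild = calls;

        // Enumerating the query (foreach, ToList, Count, ...) does the work.
        query.ToList();
        return (afterBuild, calls);
    }

    public static void Main()
    {
        var (afterBuild, afterEnumeration) = CountPredicateCalls();
        Console.WriteLine("Predicate calls after building: " + afterBuild);
        Console.WriteLine("Predicate calls after enumerating: " + afterEnumeration);
    }
}
```

The first count is zero because nothing has been enumerated yet. The second is well below 1000 because Take(10) stops pulling items once it has ten results.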

Eric Lippert
  • Thanks, I edited the question to make it clearer. If LINQ evaluates from left to right, how come I don't see any difference when shuffling the clauses? Moving "Take" from the far right to the left seems to make no difference. – user3761555 Sep 26 '19 at 16:55
  • 1
    @user3761555: Print out the results. Moving a Take from the end to the start changes the output! – Eric Lippert Sep 26 '19 at 16:57
  • 1
    @user3761555: If it's not clear, try it by hand. Start with the numbers one to a hundred. Take the first ten of them. Then filter out the odd ones. What is the result? Now start with the numbers one to a hundred. Filter out the odd ones. Then take the first ten. What is the result? – Eric Lippert Sep 26 '19 at 16:58
  • Yes, if you move "NextDouble" to the front of the chain, the final list is empty and the query is MUCH slower: `var e = lotsOfNums.Select(num => new Random(num).NextDouble()).Where(num => num % 2 == 0).Select(num => num * 100).Take(10);` – user3761555 Sep 26 '19 at 17:01
  • @user3761555: Are you testing performance of building the query or executing the query? – Eric Lippert Sep 26 '19 at 17:03
2

I think you seem to have the impression that .Take() forces evaluation, which it does not. You're seeing similar performance regardless of the position of Take() because your query isn't actually being evaluated at all. You have to add a .ToList() at the end (or maybe iterate over the result) to test the performance of the query you've built.
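
As a rough sketch of measuring execution rather than construction, the helper below forces a query to run inside the timed region. `QueryTimingDemo` and `TimeExecution` are hypothetical names chosen for this example, and the queries are simplified versions of the ones in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class QueryTimingDemo
{
    // Times the execution of a query by materializing it with ToList().
    public static (List<T> results, long elapsedMs) TimeExecution<T>(IEnumerable<T> query)
    {
        var sw = Stopwatch.StartNew();
        List<T> results = query.ToList(); // this is where the query actually runs
        sw.Stop();
        return (results, sw.ElapsedMilliseconds);
    }

    public static void Main()
    {
        var lotsOfNums = Enumerable.Range(0, 10000000).ToList();

        // Building these two queries is cheap; nothing has been evaluated yet.
        var takeLast = lotsOfNums.Where(n => n % 2 == 0).Select(n => n * 100).Take(10);
        var takeFirst = lotsOfNums.Take(10).Where(n => n % 2 == 0).Select(n => n * 100);

        var (a, msA) = TimeExecution(takeLast);
        var (b, msB) = TimeExecution(takeFirst);
        Console.WriteLine("Take last:  " + a.Count + " items, " + msA + " ms");
        Console.WriteLine("Take first: " + b.Count + " items, " + msB + " ms");
    }
}
```

Printing the item counts alongside the timings also makes it visible that the two queries return different results.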

StriplingWarrior
  • This I understand, yes. However, there seems to be no second example that illustrates lazy evaluation and its performance gain besides examples using "Take". If I don't have "Take" in my queries, only Where/Select, then unless I call ToList or Count(), there aren't really many performance tuning opportunities? – user3761555 Sep 26 '19 at 17:04
  • 1
    @user3761555: Is your question really "where do I put a Take to maximize the performance of my query execution?" The answer to that question is (1) get the position of the Take correct with respect to all Where clauses, and (2) then do the Take as early as possible. To optimize a query with no Take, similarly figure out how to place the Where as early as possible. In particular, Where should always go before OrderBy. – Eric Lippert Sep 26 '19 at 17:07
  • Am I the only one who thinks this is a case of premature optimization spun way out of control? – madreflection Sep 26 '19 at 17:09
  • 1
    @madreflection: I don't think we have a case of premature optimization. I think we have a case of not understanding the semantics of query execution, and understanding those semantics correctly can lead to some meaningful optimizations. – Eric Lippert Sep 26 '19 at 17:10
  • @EricLippert: I agree with not understanding the semantics of query execution (totally on board there) but I feel like this needs to be understood, first, without performance tuning as a primary goal. – madreflection Sep 26 '19 at 17:13
  • @madreflection: I think we are in agreement on this point. :) – Eric Lippert Sep 26 '19 at 17:14
  • I think most people's first interactions with LINQ, and indeed many of the tutorials online concerning it, are centered around database queries via Entity Framework. Which is where I imagine his confusion with the term 'lazy' is coming from. Just speculation on my part, but I think it's a reasonable assumption. – David Royce Sep 26 '19 at 17:25
  • There are two "lazies" that can get conflated, "lazy evaluation" and "lazy loading". OP is talking about the former, which applies to how iterators work. The latter applies to when data is actually retrieved from the data store, and wasn't mentioned here at all. Of all the confusions here, I think we can rule that one out. – madreflection Sep 26 '19 at 17:32
  • @user3761555: `Take` is often used in examples of the benefits of lazy evaluation because it's simple and easy to grok. But you can get the same performance benefit that Take yields without using LINQ. The benefit of LINQ largely comes from its composability: I can compose a query that represents specific values in the database, and put that into a variable. Then I can throw `.Count()` at the end to see _how many_ of those values are in the database, and then use `.Skip(...).Take(...).ToList()` on the same variable to get a page of that data in a separate query. – StriplingWarrior Sep 26 '19 at 17:38
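
The earlier suggestion in this thread to place Where before OrderBy can be illustrated by counting how often the sort key selector runs. This is a sketch under assumptions: `FilterEarlyDemo` and its methods are hypothetical names, and the data is a small in-memory range.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterEarlyDemo
{
    // Sort everything, then filter: every element reaches the sort.
    public static (List<int> results, int keyCalls) SortThenFilter()
    {
        int keyCalls = 0;
        var results = Enumerable.Range(0, 1000)
            .OrderByDescending(n => { keyCalls++; return n; })
            .Where(n => n % 2 == 0)
            .ToList();
        return (results, keyCalls);
    }

    // Filter, then sort: only the even numbers reach the sort.
    public static (List<int> results, int keyCalls) FilterThenSort()
    {
        int keyCalls = 0;
        var results = Enumerable.Range(0, 1000)
            .Where(n => n % 2 == 0)
            .OrderByDescending(n => { keyCalls++; return n; })
            .ToList();
        return (results, keyCalls);
    }

    public static void Main()
    {
        var slow = SortThenFilter();
        var fast = FilterThenSort();
        Console.WriteLine("Same results: " + slow.results.SequenceEqual(fast.results));
        Console.WriteLine("Key selector calls, sort first:   " + slow.keyCalls);
        Console.WriteLine("Key selector calls, filter first: " + fast.keyCalls);
    }
}
```

Both orders return the same list here, so this is a case where the clause can be moved earlier without changing the meaning of the query, and the sort has fewer elements to process.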