20

I'm surprised that it apparently doesn't matter whether I prepend or append LINQ extension methods.

Tested with Enumerable.FirstOrDefault:

  1. hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
  2. hugeList.FirstOrDefault(x => x.Text.Contains("10000"));

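    // Note: hugeList is a lazy query, not a materialized list;
    // its items are generated anew on every enumeration.
    var hugeList = Enumerable.Range(1, 50000000)
        .Select(i => new { ID = i, Text = "Item" + i });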
    
    var sw1 = new System.Diagnostics.Stopwatch();
    var sw2 = new System.Diagnostics.Stopwatch();
    
    sw1.Start();
    for(int i=0;i<1000;i++)
        hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
    sw1.Stop();
    
    sw2.Start();
    for(int i=0;i<1000;i++)
        hugeList.FirstOrDefault(x => x.Text.Contains("10000"));
    sw2.Stop();
    
    var result1 = String.Format("FirstOrDefault after: {0} FirstOrDefault before: {1}", sw1.Elapsed,  sw2.Elapsed);
    //result1: FirstOrDefault after: 00:00:03.3169683 FirstOrDefault before: 00:00:03.0463219
    
    sw2.Restart();
    for (int i = 0; i < 1000; i++)
        hugeList.FirstOrDefault(x => x.Text.Contains("10000"));
    sw2.Stop();
    
    sw1.Restart();
    for (int i = 0; i < 1000; i++)
        hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
    sw1.Stop();
    
    var result2 = String.Format("FirstOrDefault before: {0} FirstOrDefault after: {1}", sw2.Elapsed, sw1.Elapsed);
    //result2: FirstOrDefault before: 00:00:03.6833079 FirstOrDefault after: 00:00:03.1675611
    
    //average after:3.2422647 before: 3.3648149 (all seconds)
    

I would have guessed that prepending Where would be slower, since it must find all matching items before the first one is taken, whereas FirstOrDefault with a predicate could return the first match as soon as it is found.

Q: Can somebody explain why I'm on the wrong track?

Robert Harvey
Tim Schmelter
  • 2
    possible duplicate of [What are the benefits of a Deferred Execution in LINQ?](http://stackoverflow.com/questions/7324033/what-are-the-benefits-of-a-deferred-execution-in-linq) – Kirk Woll Apr 11 '12 at 16:28
  • 2
    @KirkWoll: How is that even remotely a duplicate? The OP here is asking about order of operations; your cited duplicate is asking about deferred execution. – Robert Harvey Apr 11 '12 at 16:31
  • 5
    @Robert, I don't think it's anything fancier than that `.FirstOrDefault` evaluates the deferred query constrained by the `.Where` and consumes only one result. Because of deferred execution, the entire contents of the sequence do not have to be evaluated. To me, it's a duplicate because I assumed the OP here didn't know about deferred execution. This behavior seems obvious to me once you have that knowledge. – Kirk Woll Apr 11 '12 at 16:32
  • 1
    Excellent question! this needs more vote ups.. – nawfal Nov 05 '12 at 03:50
  • 2
    @KirkWoll: A late reply to your duplicate comment: IMHO a question is not a duplicate just because its answers indirectly explain another question, only if the question itself is the same. If someone asks "Why does a bird not fall from the sky", is that a duplicate of "What are the benefits of wings"? No. You need to know the answer to your question already in order to find that "duplicate", so it's not one ;) – Tim Schmelter Oct 16 '13 at 11:23

2 Answers

49

I would have guessed that prepending Where would be slower, since it must find all matching items before the first one is taken, whereas FirstOrDefault with a predicate could return the first match as soon as it is found. Can somebody explain why I'm on the wrong track?

You are on the wrong track because your first statement is simply incorrect. Where is not required to find all matching items before fetching the first matching item. Where fetches matching items "on demand"; if you only ask for the first one, it only fetches the first one. If you only ask for the first two, it only fetches the first two.
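To make the "on demand" behavior concrete, here is a minimal sketch that counts how often the predicate runs. `MyWhere` is a hypothetical stand-in written with an iterator block; it is not the actual source of `Enumerable.Where`, only an illustration of the same pull-based idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyWhereDemo
{
    // Hypothetical stand-in for Enumerable.Where, written as an iterator block.
    public static IEnumerable<T> MyWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
                yield return item; // execution pauses here until the caller asks for more
        }
    }

    // Returns how many times the predicate ran before the first match was handed over.
    public static int CountPredicateCalls()
    {
        int calls = 0;
        var numbers = Enumerable.Range(1, 1000000);

        // Only the first match is requested, so the filter stops at 10.
        int first = MyWhere(numbers, n => { calls++; return n % 10 == 0; }).FirstOrDefault();

        Console.WriteLine("first match: " + first);
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine("predicate calls: " + CountPredicateCalls()); // predicate calls: 10
    }
}
```

Although the source has a million numbers, the predicate runs only ten times, because nothing asks the filter for a second item.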

Jon Skeet does a nice bit on stage. Imagine you have three people. The first person has a shuffled pack of cards. The second person has a t-shirt that says "where card is red". The third person pokes the second person and says "give me the first card". The second person pokes the first person over and over again until the first person hands over a red card, which the second person then hands to the third person. The second person has no reason to keep poking the first person; the task is done!

Now, if the second person's t-shirt says "order by rank ascending" then we have a very different situation. Now the second person really does need to get every card from the first person, in order to find the lowest card in the deck, before handing the first card to the third person.

This should now give you the necessary intuition to tell when order does matter for performance reasons. The net result of "give me the red cards and then sort them" is exactly the same as "sort all the cards then give me the red ones", but the former is much faster because you do not have to spend any time sorting the black cards that you are going to discard.
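The filter-versus-sort point can be checked with a rough sketch like the one below. The names `OrderMattersDemo` and `CountingComparer` are made up for illustration, and even numbers stand in for "red cards". It only counts comparisons made by `OrderBy`; the exact counts depend on the runtime's sort implementation, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderMattersDemo
{
    // Comparer that counts how often the sort compares two elements.
    private sealed class CountingComparer : IComparer<int>
    {
        public int Comparisons;
        public int Compare(int x, int y) { Comparisons++; return x.CompareTo(y); }
    }

    // "Give me the red cards and then sort them."
    public static (int[] Result, int Comparisons) FilterThenSort(int[] cards)
    {
        var comparer = new CountingComparer();
        var result = cards.Where(c => c % 2 == 0).OrderBy(c => c, comparer).ToArray();
        return (result, comparer.Comparisons);
    }

    // "Sort all the cards and then give me the red ones."
    public static (int[] Result, int Comparisons) SortThenFilter(int[] cards)
    {
        var comparer = new CountingComparer();
        var result = cards.OrderBy(c => c, comparer).Where(c => c % 2 == 0).ToArray();
        return (result, comparer.Comparisons);
    }

    public static void Main()
    {
        var rng = new Random(42);
        int[] deck = Enumerable.Range(1, 52).OrderBy(_ => rng.Next()).ToArray(); // shuffled deck

        var a = FilterThenSort(deck);
        var b = SortThenFilter(deck);

        Console.WriteLine("same result: " + a.Result.SequenceEqual(b.Result));
        Console.WriteLine("filter then sort comparisons: " + a.Comparisons);
        Console.WriteLine("sort then filter comparisons: " + b.Comparisons);
    }
}
```

Both pipelines produce the same sequence; the difference is only in how many cards the sort has to look at.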

Eric Lippert
  • 2
    Thank you for your vivid explanation. But I still have trouble understanding the _nature_ of deferred execution. I would have no idea how to implement such a thing myself. How is it implemented, and have you already written about it in your blog? And finally, how can I know whether a method is executed immediately (like `Any`) or deferred (like `Where`)? Is everything that uses/returns an `IEnumerable` executed deferred? – Tim Schmelter Apr 11 '12 at 17:29
  • 2
    @TimSchmelter: Think about implementing an `IEnumerable` using yield. The elements are not returned all at once but on request. Chaining `IEnumerable` functions together is relatively simple. If you want a better feel for how LINQ works, I recommend reading Jon Skeet's excellent [Edulinq](https://msmvps.com/blogs/jon_skeet/archive/tags/Edulinq/default.aspx) series, where he re-implements LINQ. [`Where`](https://msmvps.com/blogs/jon_skeet/archive/2010/09/03/reimplementing-linq-to-objects-part-2-quot-where-quot.aspx) is of particular relevance. – Brian Apr 11 '12 at 17:49
  • 2
    @TimSchmelter: I am not convinced that the concept of "deferred execution" is sufficiently clear in the first place; it seems to cause a lot of confusion. In a sense **all** execution is deferred; no execution happens **until a method is called**. The question is then *what does that method do*? The contract of `IEnumerator` is "I'll give you the next item when you call `MoveNext` followed by `Current`". So all you know for sure is that the work to get you the next item is done *before* the call to `Current` returns. It might be done *long before* or *immediately before*. – Eric Lippert Apr 11 '12 at 18:04
  • 2
    @TimSchmelter: And I agree with Brian: Jon's series is highly educational. You might also want to read up on the details of how `yield return` is implemented. The idea is pretty straightforward; the compiler transforms the code into a program that keeps track of a *number* that essentially means "what line of code was I on when I yielded most recently?", and does a "goto" to that line when `MoveNext` is called. The key to any kind of deferred execution, whether it is an iterator block or a C# 5 async block, is to *somehow keep track of what you need to do next*, aka, what is my *continuation*? – Eric Lippert Apr 11 '12 at 18:10
  • Thanks again. I actually have [problems understanding `yield`](http://stackoverflow.com/questions/3811589/translation-of-yield-into-vb-net) since I'm mainly a VB.NET developer. I'll have a further look at Jon's series. – Tim Schmelter Apr 11 '12 at 19:36
11

The Where() method uses deferred execution and provides the next matching item only when it is requested. That is, Where() does not evaluate and immediately return a sequence of all candidate objects; it provides them one at a time as they are iterated over.

Since FirstOrDefault() stops after the first item, this will cause the Where() to stop iterating as well.

Think of FirstOrDefault() as halting the execution of the Where() as if it performed a break. It's not that simple, of course, but in essence since FirstOrDefault() stops iterating once it finds an item, the Where() does not need to proceed any further.
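The "as if it performed a break" idea can be sketched as follows. `MyFirstOrDefault` and the counting source `Numbers` are hypothetical helpers for illustration, not the framework's actual implementation; the source is deliberately endless to show that the pipeline still terminates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BreakDemo
{
    private static int itemsProduced;

    // Endless source that records how many items were actually pulled from it.
    private static IEnumerable<int> Numbers()
    {
        for (int i = 1; ; i++)
        {
            itemsProduced++;
            yield return i;
        }
    }

    // Hypothetical stand-in for FirstOrDefault(): take one item, then stop iterating.
    public static T MyFirstOrDefault<T>(IEnumerable<T> source)
    {
        T result = default(T);
        foreach (var item in source)
        {
            result = item;
            break; // leaving the loop disposes the enumerator; Where() is never asked again
        }
        return result;
    }

    // Returns the first match and the number of items the source had to produce.
    public static (int First, int Produced) Run()
    {
        itemsProduced = 0;
        int first = MyFirstOrDefault(Numbers().Where(n => n > 5));
        return (first, itemsProduced);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine(r.First + " found after " + r.Produced + " items"); // 6 found after 6 items
    }
}
```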

Of course, this is the simple case of applying FirstOrDefault() to a Where() clause. If you have other clauses that need to consider all items, that could have an effect, but it would be true both for the Where().FirstOrDefault() combination and for FirstOrDefault() alone with a predicate.

James Michael Hare
  • 2
    You should probably mention somewhere in there that the order **does** matter in certain operations, as Eric does. Some cases don't, such as this one, but it's wrong to say it *never* matters. – Servy Apr 11 '12 at 17:00