9

I understand that executing operations in different orders will yield different performance such as the difference between the following slow query:

List<TestItem> slowResults = items.OrderBy(item => item.StringItem)
                                  .Where(item => item.IntItem == 100)
                                  .ToList();

and this faster one:

List<TestItem> fastResults = items.Where(item => item.IntItem == 100)
                                  .OrderBy(item => item.StringItem)
                                  .ToList();

But that is not my question:

My question is about performance of the short circuiting as it relates to a LINQ predicate. When I use a Where clause, like in this case:

List<TestItem> results = items.Where(item => item.Item1 == 12 &&
                                             item.Item2 != null &&
                                             item.Item2.SubItem == 65 &&
                                             item.Item3.Equals(anotherThingy))
                              .ToList();

Doesn't the order of the arguments matter? For instance, I would expect that doing a .Equals first would result in a slower query overall due to the Item1 == 12 integer evaluation being a much faster operation?

If order does matter, how much does it matter? Of course, calling methods like .Equals probably results is a much larger slow-down than if I was just comparing a few integers, but is it a relatively small performance penalty compared to how 'slow' LINQ operates? As LINQ makes tons of method calls, is something like .Equals really going to matter since--unless its overridden--it'll be executing native framework code, right? On the other hand, is a standard MSIL method call going to be significantly slower?

Also, are there any other compiler optimizations on this query which might be speeding this up under the hood?

Thanks for the thoughts and clarification! Brett

Brett
  • 4,066
  • 8
  • 36
  • 50
  • I was thinking LINQ to Objects, but I suppose the question is even more important for LINQ to SQL. – Brett Feb 24 '12 at 19:11
  • 2
    Although I don't know LINQ____, if it's like other short-circuited languages, the order matters, because if the expression to be evaluated ever becomes definitely true or definitely false, the rest of the predicates can be safely discarded. For example, in the expression `(1 == 1 || x == 3)`, `x==3` will never be evaluated because no matter what the result is, the expression will be true. A similar simple example might be `(1==0 && x == 3)`, where after `1==0` is found to be false, it just quits because there's no way the expression can be true. Sorry if this isn't what you're looking for! – prelic Feb 24 '12 at 19:13
  • Thats what I was thinking, @prelic! But I was also wondering about any compiler optimizations that might be taking place as well, or if queries including method calls might be optimized? – Brett Feb 24 '12 at 19:16
  • @Brett this is just a pure guess, but I would assume there are no DB query optimizations because what happens if someone else is externally connected to, and changing, the DB? I'm guessing each query, even if previously executed, is executed again to maintain transactional assurances, whereas a self-contained application might be able to make more assumptions, but like I said, that is purely a guess with no factual basis. The queries themselves can be optimized to be better formed (faster) queries, but I doubt there's any kind of caching or memoization going on behind the curtains. – prelic Feb 24 '12 at 19:19

1 Answers1

13

The answer is going to be different for different LINQ providers. In particular, the story is very different for LINQ to Objects and say LINQ to Entities.

In LINQ to Objects, the Where operator accepts the filter as Func<TSource, bool>. Func<,> is a delegate, so for the purposes of this discussion, you can think of it as a function pointer. In LINQ to Objects, your query is equivalent to this:

static void Main() {
    List<TestItem> results = items.Where(MyFilter).ToList(); 

static boolean MyFilter(TestItem item) {
    return item.Item1 == 12 && 
        item.Item2 != null && 
        item.Item2.SubItem == 65 && 
        item.Item3.Equals(anotherThingy)
}

The main thing to notice is that MyFilter is an ordinary C# method and so ordinary C# rules apply, including the short-circuiting behavior of &&. Consequently, the conditions will be evaluated in the order you wrote them. LINQ to Objects can invoke MyFilter on different input elements, but it cannot change what MyFilter does.

In LINQ to Entities and LINQ to SQL, the Where operator accepts the filter as Expression<Func<TSource, bool>>. Now, the filter is passed into the Where operator as a data structure that describes the expression. In that case, the LINQ provider will look at the data structure (the "expression tree") and it is up to the LINQ provider to decide how to interpret it.

In LINQ to Entities and LINQ to SQL cases, the expression tree will be translated to SQL. And then it is up to the database server to decide how to execute the query. The server is definitely allowed to reorder the conditions, and it may do even more substantial optimizations. For example, if the SQL table contains an index on one of the columns referenced in the condition, the server can choose to use the index and avoid even looking at rows that don't match that particular condition part.

Igor ostrovsky
  • 7,282
  • 2
  • 29
  • 28