7

Take a look at this fiddle: http://sqlfiddle.com/#!6/18324/2

Expand the very first execution plan, for the queries against view B.
Notice that the first query executes using index seek, while the second one - using index scan. In my real setup, with thousands of rows, this produces a performance hit that is quite considerable.

WTF???

The queries are equivalent, aren't they? Why does a literal produce seek and a variable - scan?
But more importantly: how can I work around this?

This post comes closest to the problem, and the solution that works from there is using option(recompile) (thank you, Martin Smith). However, that does not work for me, because my queries are being generated by my ORM library (which is Entity Framework) and I cannot amend them manually.
Rather what I'm looking for is a way to reformulate the B view so that the problem would not occur.

While fiddling with this problem, I have noticed that it is always the "Segment" block in the execution plan that loses the predicate. To verify this, I reformulated the query in terms of a subquery with min function (see view D). And voila! - both queries against the D view produce identical plans.

The bad news, however, is that I cannot use this min-powered trick, because in my real setup, the column Y is actually several columns, so that I can order by them, but I cannot take a min() of them.
So the second question would be: can anyone come up with a trick that is similar to min-powered subquery, but works for several columns?

NOTE 1: this is definitely not related to the tipping point, because there are just 2 records in the table.
NOTE 2: it also doesn't have to do with the presence of a view. See an example with view C: the server is happily using seek in that case.

Community
  • 1
  • 1
Fyodor Soikin
  • 78,590
  • 9
  • 125
  • 172
  • 2
    On my local 2008 R2 instance `select * from B where X = @a option (recompile)` produces a seek. – Martin Smith Oct 19 '12 at 14:40
  • @Martin Smith: Right. Missed that. Thank you. Unfortunately, however, this doesn't quite work. See my update to the question. – Fyodor Soikin Oct 19 '12 at 15:04
  • Quite likely a duplicate: http://dba.stackexchange.com/questions/12498/window-functions-cause-awful-execution-plan-when-called-from-a-view-with-externa – GSerg Oct 19 '12 at 15:42

3 Answers3

0

May be this one will work

select a.X, a.Y from A a
    cross apply
        (select top 1 * from A t where t.X = a.X order by t.Y asc) as idx

SQLFiddle http://sqlfiddle.com/#!6/a3362/2

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
  • Tried that one. True, it does produce a seek. But it produces TWO of them, and then a nested loops join between those two. – Fyodor Soikin Oct 19 '12 at 16:43
  • have you tried that one in real environment? It should be almost as fast as min-trick – Roman Pekar Oct 19 '12 at 17:16
  • No, I did not try it in a real environment, but I don't see how it's supposed to be almost as fast. It's doing two seeks instead of one plus an extra join, so it's supposed to be at least twice as complex, isn't it? – Fyodor Soikin Oct 20 '12 at 18:13
  • It's still not table scan :) I've tried to find some solution to get it in one index seek, but no solution still. For practical purposes, I think cross apply with top 1 will be ok. I'm working with million rows with such a query, and it works blazingly fast if you have right indexes. – Roman Pekar Oct 20 '12 at 18:30
0

The queries do produce equivalent output, but in the eyes of the sql optimizer they are different. The article recommends looking at the OPTION clause (unfortunately not included before SQL 2005).

You can Brew Your Own Query on top of the Entity Framework, which may be your best bet to achieve the desired performance.

Ryan Gates
  • 4,501
  • 6
  • 50
  • 90
  • 2
    When SQL server need to compare two values of different types, it uses rules to know how to make cast. http://msdn.microsoft.com/en-us/library/ms190309.aspx. When SQL server compares varchar and uniqueidentifier, it casts varchar to uniqueidentifier, so basically when you write `X = '40DB7DE2-EEFA-4D31-B400-7E72AB34DE99'` and X is uniqueidentifier, it meand `X = cast('40DB7DE2-EEFA-4D31-B400-7E72AB34DE99' as uniqueidentifier)` Here's SQLFiddle with two queries - one with varchar and one with uniqueidentifier - the plans are equal http://sqlfiddle.com/#!6/18324/19 – Roman Pekar Oct 19 '12 at 20:13
  • 2
    @RyanGates: Firstly, the notion that "there is a conversion for each record" simply isn't true. Secondly, it is actually the first query (the one with literal) that performs the way I want it to perform, while the second query (with variable) shows worse performance. – Fyodor Soikin Oct 20 '12 at 18:16
0

Here is my own answer.

Ultimately, I used the min-powered trick, and I got around the fact that Y is actually several columns by converting those columns into constant-length string representations (carefully tuned for sorting) and joining those strings together into one string. After that is done, I am able to use that joined string as the argument for min().

I still would like to know the proper way of doing this. If anyone happens to know it, I'd appreciate.

Fyodor Soikin
  • 78,590
  • 9
  • 125
  • 172