Why would Sql Server prefer a Clustered Index Scan over NonClustered + KeyLookup?

Question

So here's the query and the execution plan. The commented create-index statement has already been executed. enter image description here

Why does Sql-Server decide to perform a clustered index scan? Am I missing something?

You are selecting `*`. Does the index cover all columns? How many rows in the table? Please supply the `CREATE TABLE` and the query as text not as a screenshot. — Martin Smith, Jun 19 '13 at 14:30
@MartinSmith, no, index does not cover all the columns off course, that's why I was expecting a key lookup. — Artur Udod, Jun 19 '13 at 19:47
But know I got it, it's because sql-server's query optimizer estimates the effort based on the number pages to be scanned and my table has not that much data. — Artur Udod, Jun 19 '13 at 19:49
Yes there is a tipping point between the two strategies based on estimated cost of the two options. There is a third option as well applicable to large tables which is to sort the keys first so the key lookups occur in sequential order. — Martin Smith, Jun 19 '13 at 19:53

score 3 · Answer 1 · edited May 23 '17 at 11:53

3

You question seems similar to this one - Why is there a scan on my clustered index?

Also check the blog mentioned in the answer for a good overview on this subject - http://www.sqlskills.com/blogs/kimberly/the-tipping-point-query-answers/

edited May 23 '17 at 11:53

Community

1
1

answered Jun 19 '13 at 14:32

arun_bh

31
2

1

Thank you for the link, the article is really useful. – Artur Udod Jun 19 '13 at 20:06

score 3 · Accepted Answer · answered Jun 19 '13 at 16:13

If the query would use the index, it would have to search through the index pages (at least 2), get the clustered index key from the result(s), then search with each of these keys through the clustered index (at least 2 page reads per found record) to get the rest of the record not covered by the index.

This can be an effective approach, if you have a large number of records and your where clause selects only a relative small subset of them (which the query optimizer estimates based on the index statistics).

From the looks of it you only have a small number of records in the table, maybe they even fit on one page, so the optimizers says "I can do the whole query by reading and filtering the one or two pages from the clustered index once, which is way more effective than the whole nonclustered index business in this case, so I'll do that and spare my poor overloaded server engine a whole load of trouble" :-)

Edit: Try to specify WITH (INDEX(idx_WP_Discrepancy_FilterColumns2)) before the WHERE clause and compare the estimated subtree costs in the query plan to the original query to see the difference.

score 0 · Answer 3 · answered Jun 19 '13 at 14:27

0

The query plan might be cached. Add:

option (recompile)

at the end of your query to force a fresh compilation.

answered Jun 19 '13 at 14:27

Andomar

232,371
49
380
404

Adding a new index will cause any dependant plans to be automatically dropped. – Martin Smith Jun 19 '13 at 14:29
@MartinSmith: [MSDN](http://msdn.microsoft.com/en-us/library/ms181055(v=sql.105).aspx) suggests otherwise. Adding an index to a table referenced in the query is not in the list of actions that invalidate a plan. – Andomar Jun 19 '13 at 14:30
Probably not a comprehensive list. Pretty certain you will get an optimality based recompile. Will try and find a source. – Martin Smith Jun 19 '13 at 14:33
@MartinSmith: My tests show that the query plan survives the addition of a relevant index, see [Pastebin](http://pastebin.com/pCkCEWC9) – Andomar Jun 19 '13 at 14:38
[Try this one](http://pastebin.com/C4czPJ0L) with the include execution plan enabled. You see the plan does change on the second query. (Edit pastebin link amended to put second query in new batch) – Martin Smith Jun 19 '13 at 14:44
@MartinSmith: You forgot the `go` statements around the second `select`. If you add those, you'll see that the second execution of the `select` does not use the newly created index. – Andomar Jun 19 '13 at 14:49
I have added the `go`s (see edit to previous comment) and I do see the newly created index used. – Martin Smith Jun 19 '13 at 14:50
Do you mean the other way round? My pastebin does use `col1`. The reason the plan doesn't change with `SELECT *` is because it is still the optimal plan even after the index is added. Using the index would require a lookup operator to be added to the plan. For the estimated number of rows the table scan is costed less. `SELECT * FROM PlanTest WHERE col1 = 1 OPTION (RECOMPILE);` still gives the table scan plan. – Martin Smith Jun 19 '13 at 14:54
@MartinSmith: Alright, it looks like the plan is invalidated when you re-run the query, not when you create the index. My test case didn't actually re-run the query, it merely checked if the query plan survived the index creation. Your first comment is materially correct. – Andomar Jun 19 '13 at 15:03
My comment is of course not actually correct though as agree it seems it doesn't actually drop it at the time but will cause a recompile next time it is executed. – Martin Smith Jun 19 '13 at 15:05
thank you guys, learned some interesting stuff from your discussion – Artur Udod Jun 19 '13 at 19:44

Why would Sql Server prefer a Clustered Index Scan over NonClustered + KeyLookup?

3 Answers3