2

I've come across a weird situation where including a value in my select list determines whether or not my index is used in the lookup.

I have an index created on cTable(cf1,cf2,cf3).

In this first case, a table scan is performed on cTable:

select
    a.bkey
    ,c.mappedvalue
from
    aTable a
    LEFT JOIN bTable b
       ON b.bkey = a.bkey
    LEFT JOIN cTable c ON (
        c.[cf1] = b.[cf1] and
        c.[cf2] = b.[cf2] and
        c.[cf3] = a.[cf3]
    )
;

However when I remove the mappedvalue column from the select list, the index is used:

select
    a.bkey
    --,c.mappedvalue
from
    aTable a
    LEFT JOIN bTable b
       ON b.bkey = a.bkey
    LEFT JOIN cTable c ON (
        c.[cf1] = b.[cf1] and
        c.[cf2] = b.[cf2] and
        c.[cf3] = a.[cf3]
    )
;

Has anyone come across this? Is the optimizer just deciding to avoid the index?

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
Jamie
  • 168
  • 10
  • 1
    To your last question, yes it is choosing to ignore it. What are the indexes on a and b tables? – S3S Aug 31 '18 at 18:59
  • As of now, `a` and `b` do not have indexes on the corresponding `c` fields. I was wondering about that, but `a` is an enormous table, so I need to be certain before requesting an index. – Jamie Aug 31 '18 at 19:38
  • Up voted, not so much for the question, but for starting and participating in a good learning conversation. – Eric Brandt Aug 31 '18 at 20:09
  • Tip: For database performance questions it's helpful to include the actual execution plan(s). A handy tool is [Paste The Plan](https://www.brentozar.com/pastetheplan/instructions/). – HABO Aug 31 '18 at 21:59

1 Answers1

2

When you remove column then your index is covering index.

select
    a.bkey
    --,c.mappedvalue
from
    aTable a
    LEFT JOIN bTable b
       ON b.bkey = a.bkey
    LEFT JOIN cTable c ON (
        c.[cf1] = b.[cf1] and
        c.[cf2] = b.[cf2] and
        c.[cf3] = a.[cf3]
    )
;

You could add INCLUDE:

 CREATE INDEX idx ON cTable(cf1,cf2,cf3) INCLUDE (mappedvalue);

Then the query:

select
    a.bkey
    ,c.mappedvalue
from
    aTable a
    LEFT JOIN bTable b
       ON b.bkey = a.bkey
    LEFT JOIN cTable c ON (
        c.[cf1] = b.[cf1] and
        c.[cf2] = b.[cf2] and
        c.[cf3] = a.[cf3]
    )
;

should use index idx.

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • This doesn't make sense to me. So for any regular table, with a single primary key column, the index would be ignored if you select anything from that table other than the primary key? – Jamie Aug 31 '18 at 19:41
  • @Jaime No, it does not mean that. It will use index for fast row location and then key lookup (with nested loops) to get additional columns that are not within index. In your case query optimizer may calculate that it is easier to read directly from table than in index (you don't have any kind of `WHERE` clause) and probably based on cardinality it will return more than 20% of entire table. – Lukasz Szozda Aug 31 '18 at 19:43
  • When you have `SELECT * FROM tab WHERE pk_col = ?` then it will use index seek on pk_col and key_lookup for rest of column. When you have `SELECT * FROM tab` there is no point of reading index and then table because you could read table directly. – Lukasz Szozda Aug 31 '18 at 19:47
  • Ok. I'm starting to get it. A majority of the `c` table is likely being returned in the query, though it is somewhat limited by the contents of `a` and `b`. If I added a restriction on `b` that in turn restricted the number of mapped values coming back to ~5%, would it be likely to use the index then? They query above is actually used as a view, so restrictions will happen later on. – Jamie Aug 31 '18 at 19:57
  • Still depends. Even if it is view `SELECT * FROM (your_view_that_wraps_query) WHERE ...` depending on where condition and if query optimizer decides to do [`predicate pushdown`](https://blogs.msdn.microsoft.com/sql_server_team/predicate-pushdown-and-why-should-i-care/). – Lukasz Szozda Aug 31 '18 at 19:58
  • Thanks. So do you think a covered index or using include would help performance in this situation? Or am I safe to leave it up to the optimizer? All queries on `c` will be returning the `mappedvalue` field. Your `include` index makes sense to me. – Jamie Aug 31 '18 at 20:04
  • @Jaimie Covered index with `INCLUDE` will help for that specified query. – Lukasz Szozda Aug 31 '18 at 20:04