1

If I have multi-columns index with column keys: col_1, col_2, and col_3

Is the query would use this index or not if it has in the WHERE clause these conditions: col_1 = any_value AND col_3 = any_value

(the second columns in the index keys was not added to WHERE-clause)

and here is another example: if the index has 10 columns and the column keys in this order: col_1, col_2, ...., col_10 and then, I have run this query: Select col_1,col_2, ..., col_10 from X WHERE col_1 = any_value AND col_5 = any_value AND col_10 = any_value

and my question: Is the index would be used in this case or not??

Moath
  • 19
  • 3
  • I think you need to better explain what your question is – Serg Chernata Jan 11 '17 at 14:06
  • Check the execution plan. The index *may* be used only for `col_1`, The optimizer will decide based on the actual data's statistics. If eg `col_1` has very few distinct values it many not make sense to use this index – Panagiotis Kanavos Jan 11 '17 at 14:41

3 Answers3

0

new answer as your question is now more clear to me

No, the index will not be used. Only when querying on col_1 OR col_1/col_2 OR col_1/col_2/col_3 the index will/may be used. Check this with the execution plan of your query. The order of your multi-column index does matter: check this question for some discussion around this topic Multiple Indexes vs Multi-Column Indexes

If you consider that it will be more likely you will query on col_1 and col_3, why not creating a multicolumn index just on those 2 columns?

Community
  • 1
  • 1
bwydoogh
  • 23
  • 2
  • 5
  • Thanks. but in the given link, what if I have "state" and "zip" in the WHERE-cluase knowing that the order of the column keys in the index as this: "state", "county", "zip". (the second column in the index was not mentioned in the WHERE-cluase) – Moath Jan 11 '17 at 14:17
  • The index *may* be used, *only* to look for `col_1` if the optimizer decides this is cheaper than using a different index – Panagiotis Kanavos Jan 11 '17 at 14:42
  • Rewrote my answer as your question is now better described. – bwydoogh Jan 11 '17 at 14:42
  • @PanagiotisKanavos True. And indeed, as you also said: the execution plan should be checked to really understand how SQL Server handles the query. – bwydoogh Jan 11 '17 at 14:44
  • Can I force the optimizer to use this index if I updated the WHERE-clause to be like this: col_1 = any_value AND (col_2 is null OR col_2 is not null) AND col_3 = any_value – Moath Jan 11 '17 at 15:04
0

It might be used. It depends on many factors, mostly your data(and statistics about your data), and your queries.

TL/DR; you need to test this on your own data and your own queries. The index might be used.

You should try it out on the data that you have or expect to have. It is very easy to create some test-data on which you can test your queries and try different indexes. You might also need to reconsider the order of the columns in the index, is col_1 really the best column to be first in the index?

Below is a very specific scenario from which we can only conclude that the index can be used, sometimes, in similar scenarios as yours.

Consider this scenario below; the table contains 1M rows and have only a single nonclustered index on (a, b, c). Note that the values in column D is very large.

The first 4 queries below used the index, only the fifth query did not.

Why?

Sql Server will need to figure out how to complete the query while reading the least amount of data. Sometimes it is easier for SQL Server to read the index instead of the table even when the query-filter does not completely match the index.

In Query 1 and 2 the query will actually do a Seek on the index which is quite good. It knows that column A is a good candidate to perform the Seek on.

In query 3 and 4 it needs to scan the entirety of the index to find the matching rows. It still used the index.

In query 5 SQL Server realizes that it is easier to scan the actual table instead of the index.

IF OBJECT_ID('tempdb..#peter') IS NOT NULL DROP TABLE #peter;
CREATE TABLE #peter(a INT, b INT, c VARCHAR(100), d VARCHAR(MAX));

WITH baserows AS (
    SELECT * FROM master..spt_values WHERE type = 'P'
),
numbered AS (
SELECT TOP 1000000 
    a.*, rn = ROW_NUMBER() OVER(ORDER BY (SELECT null))
FROM baserows a, baserows b, baserows c
)

INSERT #peter
        ( a, b, c, d )
SELECT 
    rn % 100, rn % 10, CHAR(65 + (rn % 60)), REPLICATE(CHAR(65 + (rn % 60)), rn)
FROM numbered

CREATE INDEX ix_peter ON #peter(a, b, c);



-- First query does Seek on the index + RID Lookup.
SELECT * FROM #peter WHERE a = 55 AND c = 'P'
-- Second Query does Seek on the index.
SELECT a, b, c FROM #peter WHERE a = 55 AND c = 'P'
-- Third query does Scan on the index because the index is smaller to scan than the full table.
SELECT a, b, c FROM #peter WHERE c = 'P'
-- Fourth query does a scan on the index
SELECT a, b, c FROM #peter WHERE b = 22
-- Fifth query scans the table and not the index
SELECT MAX(d) FROM #peter

Tested on SQL Server 2014.

Peter Henell
  • 2,416
  • 1
  • 17
  • 24
0

The index will definitely be used but not effectively.

I did an experiment (SQL Server) and here is how it looks [IX_AB is an index on a, b] and I can correlate your problem with it.

enter image description here

These are the conclusions

  1. If you create an index with col1, col2, and col3 and pass col1 and col3 only, the index will only filter col1 values and then data retrieved from there will be filtered programmatically O(N) where N is the records marked by the index.
  2. Passing the mid-value as "not null" or "null" does not help.
Hrishabh Gupta
  • 1,912
  • 15
  • 13