2

I have a table that filters data on three columns: Country, State, City.

The queries in my application get more selective, but always include the more general column(s). For example, the WHERE clause would look like

WHERE Country = 'Canada'
WHERE Country = 'Canada' AND State = 'Ontario'
WHERE Country = 'Canada' AND State = 'Ontario' AND City = 'Toronto'

But wouldn't have a WHERE clause on just the City column or just the State column.

I'm tasked with creating an index for this table. And I understand that the column order matters in creating an index, but was confused on what would be more performant.

Here's the two options I'm thinking of:

Option 1:

(Country, State, City)

Option 2:

(City, State, Country)
(State, Country)
(Country)

The first option covers all the queries in one index. But the second option (I think) would perform better. What I wanted to ask is if there is an actual performance benefit in using Option 2?

Thoughts?

Sorry, if I didn't make much sense. Let me know if clarification is required. And Thank you!

RaziAbbasi
  • 23
  • 5

1 Answers1

3

Definitely go for Option 1.

This allows the exactly correct rows to be seeked (I.e. no more and no less) for all three of your identified query patterns.

There is no advantage of having indexes ordered with most selective column first if you never query based on that column on its own. The selectivity of that single column makes no difference for queries seeking on multiple columns.

The "most selective first" trope is an area quite misunderstood in the SQL Server community. I go into more detail here.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • Based on the accepted answer to this SO question: https://stackoverflow.com/questions/2292662/how-important-is-the-order-of-columns-in-indexes, I thought having the city first would improve speed of the lookup (since we have fewer rows for a city than for a Country). I was going with Option 1 initially, but after reading the accepted answer there I wanted clarification. – RaziAbbasi Jan 07 '22 at 23:47
  • Just saw your edit. You mentioned the same SO question that confused me. I'll read up on that. Thank you again! – RaziAbbasi Jan 07 '22 at 23:49
  • I already commented on the accepted answer there some time ago complaining it is misleading and submitted a more accurate answer. To my mind the accepted answer there sounds like it is teaching you something useful but when you dig into it is nonsense. Searching for `'Canada', 'Ontario', 'Toronto'` and `'Toronto', 'Ontario', 'Canada'` makes no difference. In both cases you are effectively searching for a key that is a concatenation of all three values. – Martin Smith Jan 07 '22 at 23:50
  • But anyway - you don't need to wonder who to believe on this. Create a table with dummy data and try `(City, State, Country)` vs `(Country, State, City)` . Look at "number of rows read" in the execution plans and logical reads (the other two indexes you propose could have lower logical reads just because they have fewer key columns so take less space but unlikely to be worth having multiple indexes when one can do the job) – Martin Smith Jan 08 '22 at 00:00