17

I'm trying to understand what is better when defining multi-column indexes:

  • Putting the most selective column first (higher cardinality, for speed?); or
  • Putting the less selective column first (Lower cardinality, for index compression?)

Or maybe it depends if I'm optimizing for speed or space?

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
sbargay
  • 171
  • 1
  • 3
  • Your index order is defined by what queries the index is meant to serve. Once you define the indexes by the queries, you DON'T have a choice in what order they appear without making the index unusable for some of the queries. – Dan Grossman Jan 24 '11 at 00:19
  • Again, I understand that, but between the two options that serve the queries - which is better, taking the huge difference in cardinality? – sbargay Jan 24 '11 at 00:30
  • 1
    @sbargay Did you ever figure out the answer to this question? I know what you are asking. If you have country_id and person_id, is it better to setup the index country_id, person_id, or is it better to set the higher cardinality first with person_id, country_id. It appears you obviously know how to use indexes properly in your queries, but I had the same question as you. – n0nag0n Sep 17 '14 at 22:58

3 Answers3

9

The order of the columns should match the order in which the columns are queried later or MySQL will not use them. This is the question you should really think about.

Read more here.

UPDATE:

For your question about cardinality maybe read this. Is this similar to your question? Does it answer it?

Community
  • 1
  • 1
FabianB
  • 273
  • 1
  • 6
  • I take care of the sequence (left most can be used for single column queries etc), but still - I have the flexibility to decide who is first and who is second. I'm looking for a hint to select which goes where. – sbargay Jan 24 '11 at 00:03
  • Did you read the Link? Either I do not understand the question or the answer is given and is "in the order they are typically queried". – FabianB Jan 24 '11 at 00:07
  • Yes, I read it: e.g. index on columns a=cardinality=23, b=cardinality=1000000, c=cardinality=500000. My queries always needs column a,b or a,b,c . – sbargay Jan 24 '11 at 00:12
  • Then your index must be on (a,b,c) if you want both queries to be able to use the index for all 2/3 columns. – Dan Grossman Jan 24 '11 at 00:17
  • [This Enter thing got this comment in the middle] Yes, I read it: e.g. index on columns a=cardinality=23, b=cardinality=1000000, c=cardinality=500000. My queries need columns a, a,c or a,b,c. I can basically create index (c,a,b)+(a) or (a,b,c)+(a,c) – sbargay Jan 24 '11 at 00:20
  • "which the columns are queried later" -- is rather cryptic. – Rick James Jul 11 '18 at 17:52
4

Always put the most selective column in the beginning, there is very rarely a reason for the other way around.

or maybe it depends if I'm optimizing for speed or space?

Let me put it this way. What is the point of using less storage, if it causes the index not to be used at all? A low cardinality index (going in column order) will normally not be used if it is not a covering index for the query, because it will be terribly expensive to go back to the data for other columns.

The point of indexes is to assist the query and having them in the right order (cardinality) should always be the first and foremost consideration.

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • 2
    Ok - say that I save country id (low cardinality) and person id (high cardinality). same person id can exist in several countries. Should I put country_id, person_id or vice versa? I sometimes need to retrieve all people from a country, and sometimes reach all the people with the same person_id – sbargay Feb 12 '11 at 22:21
0
WHERE person_id = 123 AND country_code = 'AT'

Use

INDEX(person_id, country_code)  -- in EITHER order!

There is no difference in speed or space for the order of the index columns in this case.

Yeah, MyISAM had "index compression", but that is not used anymore.

Cardinality only matters for comparing separate indexes, not for ordering columns in a composite index. That is,

INDEX(person_id)  -- is better than
INDEX(country_code)

But neither is as good as the composite index.

For

WHERE person_name LIKE 'James%' AND country_code = 'UK'

the best index is

INDEX(country_code, person_name)   -- in THIS order!

The order in the WHERE has no impact on optimization.

More tips and discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Rick James
  • 135,179
  • 13
  • 127
  • 222