5

let say I have polymorphic similar to this

| document_id | owner_type | owner_id |
| 1           | Client     | 1        |
| 1           | Client     | 2        |
| 2           | User       | 1        |

I know I'll be calling queries looking for owner_type and owner_type + owner_id

SELECT * FROM document_name_ownerships WHERE owner_type = 'Client`
SELECT * FROM document_name_ownerships WHERE owner_type = 'Client` and owner_id = 1

Lets ignore how to index document_id I would like to know what is the best way(performance) to index owner columns for this SQL scenarios

Solution 1:

CREATE INDEX do_type_id_ix ON document_ownerships (owner_type, owner_id)

this way I would have just one index that works for both scenarios

Solution 2:

CREATE INDEX do_id_type_ix ON document_ownerships (owner_id, owner_type)
CREATE INDEX do_type_ix ON document_ownerships (owner_type)

this way I would have indexes that totally match the way how I will use database. The only thing is that I have 2 indexes when I can have just one

Solution 3:

CREATE INDEX do_id_ix ON document_ownerships (owner_id)
CREATE INDEX do_type_ix ON document_ownerships (owner_type)

individual column indexes


From what I was exploring in MySQL console with explain I get really similar results and because Its a new project I don't have enought data to properly explore this so that I'll be 100% sure (even when I populated databese with several hundred records). So can anyone give me piece of advise from their experience ?

Alma Do
  • 37,009
  • 9
  • 76
  • 105
equivalent8
  • 13,754
  • 8
  • 81
  • 109
  • Why didn't you try anything? At least search on SO for such questions? It's a very, very basic question and it has [many](http://dev.mysql.com/doc/refman/5.5/en/multiple-column-indexes.html), [many](http://stackoverflow.com/questions/512909/slow-mysql-query-what-should-i-index?rq=1), [many](http://stackoverflow.com/questions/8223765/multi-column-index-performance) answers. – Alma Do Oct 21 '13 at 11:25
  • 1
    thank you for suggestion, Yes I did and although you're right there are many question "like this" they're not quite asking what I'm asking therefor I didn't found satisfying answer to which of the listed solutions in my question is the best – equivalent8 Oct 21 '13 at 11:56

1 Answers1

3

This is going to depend a lot on the distribution of your data - indexes only make sense if there is good selectivity in the indexed columns.

e.g. if there are only 2 possible values for owner_type, viz Client and User, and assuming they are distributed evenly, then any index only on owner_type will be pointless. In this case, a query like

SELECT * FROM document_name_ownerships WHERE owner_type = 'Client`;

would likely return a large percentage of the records in the table, and a scan is the best that is possible (Although I'm assuming your real queries will join to the derived tables and filter on derived table-specific columns, which would be a very different query plan to this one.)

Thus I would consider indexing

  1. Only on owner_id, assuming this gives a good degree of selectivity by itself,
  2. Or, on the combination (owner_id, owner_type) only if there is evidence that index #1 isn't selective, AND if the the combination of the 2 fields gives sufficient selectivity to warrant this the index.
Community
  • 1
  • 1
StuartLC
  • 104,537
  • 17
  • 209
  • 285
  • really good stuff thank you a lot, For goggling people I especially suggest http://dev.mysql.com/doc/refman/5.5/en/how-to-avoid-table-scan.html found in that link you provided. – equivalent8 Oct 21 '13 at 13:26
  • and yes you're right I'll be using this table mainly as a part of join, so does that mean that when I do join SQL call like `SELECT document_names.* FROM document_names INNER JOIN document_name_ownerships ON document_name_ownerships.document_name_id = document_names.id WHERE document_name_ownerships.owner_type = 'Client' AND (document_names.id BETWEEN 1 AND 10)` the query would be narrowed down enough that index for `owner_type` would make sense ? – equivalent8 Oct 21 '13 at 13:34
  • If MySql plays along nicely, it will FIRST apply selective filters on the derived tables before joining to the base table and applying further filters on a much smaller dataset. It is also important to ensure that the join key(s) in the derived table back to the base table is indexed (as you would normally do with foreign keys) – StuartLC Oct 21 '13 at 13:39
  • so in this case you mean `document_name_ownerships.document_name_id` I guess. Well thank you a lot Stuart, you really helped me here. – equivalent8 Oct 21 '13 at 13:53