0

I understand that a multi-column index is required and that index order matters in requests such as

    SELECT product_id 
      FROM order_detail 
     WHERE order_id in (1, 2, 3)

However, I can't figure out the necessity of multi-column indexes and index order when there is no value in the WHERE clause.

  1. I believe multi-column index is only used when a value is specified in the WHERE clause (I may be wrong). Therefore, in order to "optimize" the table order_detail and for the request below, should I create a multi-column index on product_id and order_id or should I create two separate indexes, one on order_id and one on product_id:

    SELECT O.order_id, P.product_name 
      FROM order_detail OD, order O, product P 
     WHERE OD.order_id = O.order_id 
       AND P.product_id = OD.product_id    
    
  2. If a multi-column index is still required when search is proceeded on two columns in the WHERE clause even if there is no value in it, what is the proper order to give to the multi column index then? Regarding the example above for the table order_detail, should I use (order_id, product_id) or should I use (product_id, order_id)? Does the order matter when there is no value in the WHERE clause?

Many thanks for your help.

Vincent
  • 1,651
  • 9
  • 28
  • 54

3 Answers3

1

The order of columns matter. MySQL can use the index (product_id, order_id) for the following:

  • WHERE product_id = 123
  • WHERE product_id = 123 AND order_id = 456

But not for:

  • WHERE order_id = 123

It is also worth noting that MySQL can choose NOT to use the index. For example if there are only 10 distinct products in 1000 orders then MySQL could ignore the index.

I would suggest placing columns with higher cardinality first. Or better, create both indexes (product_id, order_id and order_id, product_id) and let MySQL decide which one to use.

Salman A
  • 262,204
  • 82
  • 430
  • 521
0
  1. Only one index can be used when optimizing the WHERE clause. So if you're testing multiple columns, and you want the best performance, you should use a multi-column index that includes all (or most) of them. If you create two separate indexes, it will just use one of them to optimize that part of the WHERE condition, and then it will have to do scans within those rows to match the other condition.

  2. A multi-column index can be used as an index for any set of columns in a prefix. So if you have an index on (col1, col2, col3), it acts as an index on col1 and (col1, col2) as well. If you have an index on (order_id, product_id), it can be used to optimize the first query as well. If you also want to be able to optimize queries that just use product_id, create an additional index on that column. Conversely, if you have an index on (product_id, order_id), you'll need a separate index on order_id to optimize the first query.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thank you. Regarding the second query, if I have an index on (product_id, order_id), will it act as the same as (order_id, product_id) ? Does the order in the index matters if there is no value in the WHERE clause? – Vincent Mar 03 '15 at 09:08
  • I thought I was clear, the order matters. A multi-column index is also an index on the prefix columns. It's not an index on any columns that are not a prefix. So `(product_id, order_id)` is not an index on the `order_id` column by itself, but it IS an index on the `product_id` column by itself. – Barmar Mar 03 '15 at 09:11
  • I'm not sure what you mean by "no value in the WHERE" clause. In your second query, you have values in the `WHERE` clause, they're the columns from the table you're joining with. Indexes are used to optimize the joining strategy. – Barmar Mar 03 '15 at 09:13
0

Why are you preoccupied with whether there are literal values for columns given in the WHERE clause? For that matter, why are you preoccupied with the WHERE clause at all? Your second query can be rewritten without a WHERE clause and will still give the same results:

SELECT
    O.order_id,
    P.product_name
FROM
    order_detail AS OD
    JOIN order AS O ON OD.order_id = O.order_id
    JOIN product AS P ON P.product_id = OD.product_id

It doesn't especially matter whether the engine is looking for a literal value specified in the query, or whether it's looking for a value from a row in a joined table. It will attempt to choose a strategy that makes the process of joining tables efficient.

Hammerite
  • 21,755
  • 6
  • 70
  • 91
  • Thank you, but in order_detail, should I create a multicolumn index or 2 separate indexes (one on product_id and one one order_id) ? – Vincent Mar 03 '15 at 09:16