(the table is used by legacy apps so we can't change the schema itself)
We had a table courses_old
with around 1.6 million rows
mysql> describe courses_old ;
+---------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+-------+
| chapter | varchar(128) | NO | PRI | | |
| label | varchar(128) | NO | PRI | | |
| title | varchar(128) | YES | | NULL | |
| language | varchar(48) | NO | PRI | en | |
+---------------+------------------+------+-----+---------+-------+
We often do the request
SELECT DISTINCT
chapter,
title,
language
FROM courses_old;
it was quite slow, so we added an index on these 3 fields and it became faster, and the explain was like this:
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
| 1 | SIMPLE | courses_old | range | idx_c_t_l_old | idx_c_t_l_old | 919 | NULL | 65072 | Using index for group-by |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
We discovered there was a lot (50% !) of unused values, so we did the following
- renaming the table in
courses_old
- creating again the table named this time
courses_new
- doing a
insert into courses_new select * from courses_old where ...
- recreating the same index
The number of rows is now 840000 And now the explain look like this
SELECT DISTINCT
chapter,
title,
language
FROM courses_new;
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+-------------+
| 1 | SIMPLE | courses_new | index | idx_c_t_l_new | idx_c_t_l_new | 919 | NULL | 809549| Using index |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+-------------+
And the requests has gone from 0.2s
to 1s
However if now i do
SELECT DISTINCT
chapter,
title,
language
FROM courses_new
FORCE INDEX (idx_c_t_l);
the performance are back to normal, as well as the explain
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
| 1 | SIMPLE | courses_new | range | idx_c_t_l_new | idx_c_t_l_new | 919 | NULL | 101194| Using index for group-by |
+----+-------------+-------------+-------+----------------+---------------+---------+------+-------+--------------------------+
so my questions are
- why MySQL has suddenly decided to stop doing the same explain , though it's less efficient ?
- is there a way (by playing with some parameters etc.) to make MySQL come back to reason and find alone the most efficient query plan ?
Edit: some precisions:
- I've kept the two versions of the tables, so they're running exactly under the same condition
- On the new table, there's less rows than on the old one
- I've benchmarked each requests 1000 time each to be sure there were no cache causing the perf difference