
I am posting this as a new question (following up on another question I asked here: How to efficiently select and group by a substring of a column), since I think the new problem that arose warrants its own post. I'm not sure if that was the right thing to do, but please let me know if it wasn't.

I currently have 3 (simplified) tables on a MySQL server (v5.5.14), using InnoDB (v1.1.8):

1) There are actually four different versions of this table, where each stores data of a certain type, but all have these 3 columns

+--------------------------------------------------+
|  PropData                                        |
+--------------+-----------+-----------------------+
|  Id          |  BigTag   |  Date                 |
+--------------+-----------+-----------------------+
|  [SomeGUID]  |  10001AB  |  1000-01-01 00:00:00  |
+--------------+-----------+-----------------------+

2) The records in this table mirror the data in the table above, except it only contains the primary key (Id) and the first 5 characters of the BigTag (to easily join with LimitTags). There is one of these tables for each respective PropData table

+------------------------+
|  PropDataTag           |
+--------------+---------+
|  Id          |  Tag    |
+--------------+---------+
|  [SomeGUID]  |  10001  |
+--------------+---------+

3) There is only one version of this table

+-------------------+
|  LimitTags        |
+---------+---------+
|  Tag    |  Model  |
+---------+---------+
|  10001  |  Base   |
+---------+---------+

I am basically trying to get all unique tag/model pairs for the data within some time span.

With these three tables, I eventually came up with the following query:

SELECT DISTINCT T.Tag, T.Model
FROM PropData P
   JOIN PropDataTag N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

The results look something like this:

+---------+----------+
|  Tag    |  Model   |
+---------+----------+
|  10001  |  Base    |
|  10002  |  Base    |
|  10003  |  Base    |
|  10004  |  Base    |
|  10001  |  Upgrade |
|  10002  |  Upgrade |
|  10001  |  Crappy  |
+---------+----------+

I have the following indexes:

1) PropData: PRIMARY(Id), IdxDate(Date), IdxTag(BigTag), IdxIdAndDate(Id, Date)

2) PropDataTag: PRIMARY(Id), IdxTag(Tag)

3) LimitTags: PRIMARY(Id), IdxTag(Tag), IdxTagAndModel(Tag, Model)
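
For reference, the simplified tables look roughly like this (the column types and lengths here are my approximations, and the real tables have additional columns, so treat this as a sketch rather than the actual DDL):

-- Approximate DDL for the simplified tables; types and lengths are guesses.
CREATE TABLE PropData (
    Id     CHAR(38)   NOT NULL,   -- GUID stored as a string
    BigTag VARCHAR(7) NOT NULL,
    Date   DATETIME   NOT NULL,
    PRIMARY KEY (Id),
    KEY IdxDate (Date),
    KEY IdxTag (BigTag),
    KEY IdxIdAndDate (Id, Date)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE PropDataTag (
    Id  CHAR(38)   NOT NULL,
    Tag VARCHAR(5) NOT NULL,      -- first 5 characters of PropData.BigTag
    PRIMARY KEY (Id),
    KEY IdxTag (Tag)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE LimitTags (
    Id    CHAR(38)    NOT NULL,   -- not shown in the simplified table above
    Tag   VARCHAR(5)  NOT NULL,
    Model VARCHAR(40) NOT NULL,   -- length approximate
    PRIMARY KEY (Id),
    KEY IdxTag (Tag),
    KEY IdxTagAndModel (Tag, Model)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;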

The first time I ran it, it went perfectly and I got my results (696 records) in 0.016 sec. I also ran the EXPLAIN command and got the following results. When I took the screenshot, I unfortunately neglected to expand the ref column so I don't know what two of the values are and am having trouble reproducing these results.

id  select_type  table  type    possible_keys                 key             key_len   ref     rows     Extra 
1   SIMPLE       T      index   IdxTag,IdxTagAndModel         IdxTagAndModel  49        NULL    1427     Using index; Using temporary
1   SIMPLE       N      ref     IdxTag                        IdxTag          7         NoIdea  1238     Using index; Distinct
1   SIMPLE       P      eq_ref  PRIMARY,IdxDate,IdxIdAndDate  IdxDate         38        NoIdea  1        Using where; Distinct

Thinking everything was well in hand, I then tried the exact same query on all 4 PropData tables and UNIONed the results together to get a complete list of the tag/model pairs. After running the query, it took more than 2 minutes before I stopped it for taking too long. I then ran EXPLAIN on the original query shown above, using the exact same tables, and instead of getting the same results, I got the following:

id  select_type  table  type    possible_keys                 key         key_len   ref    rows     Extra 
1   SIMPLE       P      range   PRIMARY,IdxDate,IdxIdAndDate  IdxDate     24        NULL   1785585  Using where; Using Index; Using temporary
1   SIMPLE       N      eq_ref  PRIMARY,IdxTag                PRIMARY     38        P.Id   1
1   SIMPLE       T      ref     IdxTag,IdxTagAndModel         IdxTag      7         N.Tag  1

Running the original query now takes > 30 sec instead of ~0.016 sec. The data and the indexes on these tables have definitely not changed, and I ran the explain command for these queries about 5 minutes apart.

What just happened? I can't figure out:

  • Why did the order of the rows in the explain output change?

  • Why did MySQL decide to use different indices all of a sudden?

Does anyone have any ideas or input? I searched around for other posts, but no one seems to have experienced this behavior.

Edit 1:

I was able to reproduce this once when I tried to cancel the query (without EXPLAIN) mid-execution, which caused MySQL Workbench to crash. On restart, it worked the first time, giving me the results instantly. When I then ran the query on all 4 tables, the indexes switched again and I experienced the same phenomenon as above, but with a new, different set of EXPLAIN results:

id  select_type  table  type    possible_keys                 key             key_len   ref    rows     Extra 
1   SIMPLE       P      range   PRIMARY,IdxDate,IdxIdAndDate  IdxDate         24        NULL   1796958  Using where; Using Index; Using temporary
1   SIMPLE       N      eq_ref  PRIMARY,IdxTag                PRIMARY         38        P.Id   1
1   SIMPLE       T      ref     IdxTag,IdxTagAndModel         IdxTagAndModel  7         N.Tag  1        Using index

I tried forcing the query to use the same indexes that were shown in the desired first set of EXPLAIN outputs:

SELECT DISTINCT T.Tag, T.Model
FROM PropData P FORCE INDEX (PRIMARY)
   JOIN PropDataTag N FORCE INDEX (IdxTag) ON P.Id=N.Id
      JOIN LimitTags T FORCE INDEX (IdxTagAndModel) ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

And I got these results from explain:

id  select_type  table  type    possible_keys   key             key_len   ref    rows     Extra 
1   SIMPLE       N      index   IdxTag          PRIMARY         7         P.Id   1        Using index; Using temporary
1   SIMPLE       T      ref     IdxTagAndModel  IdxTagAndModel  7         N.Tag  1        Using index
1   SIMPLE       P      eq_ref  PRIMARY         IdxDate         38        NULL   1796958  Using where; Distinct

The main differences I see between these latest results and the original working version are that the IdxTagAndModel key only has a key_len of 7 rather than 49, and that table N doesn't have Distinct in the Extra column.

Other differences to note are that the number of rows is different for table P, and that Using Temporary is on table N instead of T.
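
For what it's worth, my understanding is that FORCE INDEX only constrains which index is used, not the join order; to also pin the T → N → P order from the fast plan, something like STRAIGHT_JOIN would presumably be needed. An untested sketch:

SELECT DISTINCT T.Tag, T.Model
FROM LimitTags T
   STRAIGHT_JOIN PropDataTag N ON N.Tag=T.Tag
      STRAIGHT_JOIN PropData P ON P.Id=N.Id
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'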

Edit 2:

Here is the full query I perform that seemed to switch which indexes were used:

SELECT DISTINCT T.Tag, T.Model
FROM PropData1 P
   JOIN PropDataTag1 N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData2 P
   JOIN PropDataTag2 N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData3 P
   JOIN PropDataTag3 N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData4 P
   JOIN PropDataTag4 N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

I didn't include it initially because it's literally the same query repeated four times on different tables. Each table holds a different kind of data, such as doubles or BLOBs, but those columns aren't used at all in this query.
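
As an aside, since UNION (without ALL) already removes duplicate rows, the per-SELECT DISTINCT is redundant for the final result; the same output could be produced with plain SELECTs in each branch. A sketch, with only the first branch shown:

SELECT T.Tag, T.Model
FROM PropData1 P
   JOIN PropDataTag1 N ON P.Id=N.Id
      JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'

UNION
-- ... the PropData2/3/4 branches repeat the same pattern without DISTINCT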

  • `Explain` is voodoo guesswork. Don't take it too seriously. If you want ordering, then say so in the query. – Drew Jul 12 '16 at 18:42
  • @Drew Can you elaborate on what you mean by ordering? – Aftab Narsimhan Jul 12 '16 at 18:50
  • You are only showing two queries here, as far as I can see. There is no `ORDER BY` clause, so the optimizer uses what it has for speed and delivers your results in the order it chooses, because you didn't ask for one. – Drew Jul 12 '16 at 18:52
  • I used to order my result by date time, but I removed it a while ago for simplicity. I still don't understand why the optimizer would keep changing the order and the indexes when none of the data or indexes have changed. – Aftab Narsimhan Jul 12 '16 at 18:59
  • Are you talking about the ordering as given to you in the Result Set or the ordering of what Explain dumps out in its 3 line table – Drew Jul 12 '16 at 19:00
  • The server is delivering to you exactly the data you asked for: un-ordered data. The fact that it had one set of indices cached or not at any point in time and not 10 minutes later (thus deciding a different way of giving you un-ordered data) is simply what happened. You got what you asked for, and it is simply a misunderstanding of the server concepts on your part. – Drew Jul 12 '16 at 19:03
  • I'm talking about what explain dumps out. And fair enough, I'm still pretty new to SQL in general. Could you point out which server concepts I should look into specifically? – Aftab Narsimhan Jul 12 '16 at 19:11
  • Internal caching, MRU, and the order by clause :p – Drew Jul 12 '16 at 19:12
  • _"Running the query without explain now takes > 30 sec instead of < 0.1 sec"_ barring possible caching benefits, running the query with the explain should always take less time than without; as running it with the explain is not actually running the query. – Uueerdo Jul 12 '16 at 20:06
  • @Uueerdo You misunderstand. The query that is shown above was first ran as is, and returned my 696 results in 0.016 sec. It then later ran in 31 sec. I am not referring to the timing of the explain command. – Aftab Narsimhan Jul 12 '16 at 20:36
  • but you haven't shown the second query, the one that involves UNION. that's probably why you are attracting down votes – e4c5 Jul 13 '16 at 00:30
  • @e4c5 Good point, I'll add that now! – Aftab Narsimhan Jul 13 '16 at 00:35
  • Can you check if you mix unicode and non-unicode varchar-fields for your guids? (Or don't these guids have a constant length?). And I guess you need an index `PropDataTag(id, tag)`. (Assuming your columns are just a subset of your actual table). (That is not directly related to the question why a query for 10million rows needs more time than a query for 1000 rows, just an attempt to understand your 10million-row-query - it still should not take 30s). And btw, do not try to force away `IdxDate`, that index is important (and mysql uses it anyway for good reason). – Solarflare Jul 13 '16 at 09:16
  • @Solarflare The guids are all char(38). And I just ran the third query from the answer [here](http://stackoverflow.com/questions/6103595/how-to-check-if-mysql-table-is-utf-8-and-has-storageengine-innodb) and the result is that all the tables I'm operating with are latin1. I also tried adding the index you suggested, but it didn't change anything. Lastly, I'm not entirely sure about that index being super important. If I'm doing a scan of a database where I get all the records where the datetime is between two values, wouldn't it have to do a full table scan regardless of the index? – Aftab Narsimhan Jul 13 '16 at 16:38

1 Answer


This is the root cause of most of your problems.

I currently have 3 (simplified) tables on a MySQL server (v5.5.14), using InnoDB (v1.1.8):

1) There are actually four different versions of this table, where each stores data of a certain type, but all have these 3 columns

All four versions of the table having the same columns means this is essentially the same data with subtle differences. RDBMS systems have a built-in mechanism for dealing with that: partitions.

... by enabling you to distribute portions of individual tables across a file system according to rules which you can set largely as needed. In effect, different portions of a table are stored as separate tables in different locations. The user-selected rule by which the division of data is accomplished is known as a partitioning function

By using partitioning, you immediately eliminate the need for UNION, and your query can be simplified greatly.
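
A rough sketch of what that could look like (names and types here are illustrative, not your actual schema; note that in MySQL 5.5 every unique key, including the primary key, must include the partitioning column):

-- One PropData table with a discriminator column instead of four separate tables.
CREATE TABLE PropDataAll (
    Id       CHAR(38)   NOT NULL,
    DataType TINYINT    NOT NULL,   -- 1..4, replaces the four table variants
    BigTag   VARCHAR(7) NOT NULL,
    Date     DATETIME   NOT NULL,
    PRIMARY KEY (Id, DataType),     -- must include DataType because of the partitioning
    KEY IdxDate (Date),
    KEY IdxTag (BigTag)
) ENGINE=InnoDB
PARTITION BY LIST (DataType) (
    PARTITION p1 VALUES IN (1),
    PARTITION p2 VALUES IN (2),
    PARTITION p3 VALUES IN (3),
    PARTITION p4 VALUES IN (4)
);

-- The four PropDataTagX tables could be merged the same way, and the UNION
-- then collapses into a single SELECT against the merged tables.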

As for why the simple query uses one index and the UNION query uses another, it's simply because your different PropDataTagX tables seem to have hugely different numbers of rows. If they all had a similar number of rows, the same query plan might be used.
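
If the plan keeps flipping even though the data has not changed, it may also be worth refreshing the InnoDB index statistics, which are estimated by random sampling and can therefore drift between runs. A hedged suggestion, using the table names from the question:

-- Recompute index statistics for the tables involved in the query.
ANALYZE TABLE PropData1, PropData2, PropData3, PropData4,
              PropDataTag1, PropDataTag2, PropDataTag3, PropDataTag4,
              LimitTags;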

e4c5
  • While this does make sense to me, having one partitioned PropData table wouldn't work for me. First and most importantly, the table structures are completely fixed and are a packaged solution that comes with another third party application. There are several legacy applications that rely on this table structure and changing the tables isn't an option. The only power I have is to create additional tables. – Aftab Narsimhan Jul 13 '16 at 02:06
  • Secondly, in each PropDataX table, I have the fields _Value_, _HighLimit_, _LowLimit_ and these contain doubles, blobs or varbinary depending on the table type. I admit that I don't know much about partitioning, but I don't know how I'd go about containing all of these values in one table, especially since these tables don't have the same number of columns – Aftab Narsimhan Jul 13 '16 at 02:08
  • I appreciate the response though, and for suggesting another topic for me to research! – Aftab Narsimhan Jul 13 '16 at 02:11
  • Very well, even in that case my answer still stands: the vast difference in row counts changes the query plan. – e4c5 Jul 13 '16 at 02:11
  • I had no idea that was possible. Two of the tables that I'm currently working with have no data in them, whereas the other two have around 10 million each. I just wish I knew how to reliably reproduce the result I saw the first time, and figure out why that first query was lightning fast. I still need to get that working so it doesn't bog down the application that uses this query. – Aftab Narsimhan Jul 13 '16 at 02:15
  • 10 million vs 0, that is a huge difference! A table with only a few thousand records doesn't actually need an index! – e4c5 Jul 13 '16 at 02:56
  • Ideally all of the tables will have a large amount of data (in the order of ~50+ million), but I just happened to choose a database where two of the tables were unpopulated. I guess for now it makes more sense to just run the queries on only the tables with data in them. I just thought it would be more representative if I used queries for all tables, regardless of the data. – Aftab Narsimhan Jul 13 '16 at 03:01
  • what happened next? – e4c5 Jul 28 '16 at 01:29
  • Nothing unfortunately... I still haven't found a way around this. I've put this on the back burner while I finish off other parts of the application – Aftab Narsimhan Jul 28 '16 at 02:57