I am posting this as a new question (from another question I asked here: How to efficiently select and group by a substring of a column), since I think the new problem that arose warrants its own post. I'm not sure if that was the right thing to do, but please let know if it wasn't.
I currently have 3 (simplified) tables on a MySQL (V5.5.14), InnoDB (V1.1.8):
1) There are actually four different versions of this table, where each store data of a certain type, but all have these 3 columns
+--------------------------------------------------+
| PropData |
+--------------+-----------+-----------------------+
| Id | BigTag | Date |
+--------------+-----------+-----------------------+
| [SomeGUID] | 10001AB | 1000-01-01 00:00:00 |
+--------------+-----------+-----------------------+
2) The records in this table mirrors the data in the above table, except it only contains the primary key (id) and the first 5 chars of the BigTag (to easily join with the LimitTags). There is one of these tables for each respective PropData table
+------------------------+
| PropDataTag |
+--------------+---------+
| Id | Tag |
+--------------+---------+
| [SomeGUID] | 10001 |
+--------------+---------+
3) There is only one version of this table
+-------------------+
| LimitTags |
+---------+---------+
| Tag | Model |
+---------+---------+
| 10001 | Base |
+---------+---------+
I am basically trying to get all unique pairs of tags and models for a range of data between some time span.
With these three tables, I eventually came up with the following query:
SELECT DISTINCT T.Tag, T.Model
FROM PropData P
JOIN PropDataTag N ON P.Id=N.Id
JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
The results look something like this:
+---------+----------+
| Tag | Model |
+---------+----------+
| 10001 | Base |
| 10002 | Base |
| 10003 | Base |
| 10004 | Base |
| 10001 | Upgrade |
| 10002 | Upgrade |
| 10001 | Crappy |
+---------+----------+
I have the following indexes:
1) PropData: PRIMARY(Id), IdxDate(Date), IdxTag(BigTag), IdxIdAndDate(Id, Date)
2) PropDataTag: PRIMARY(Id), IdxTag(Tag)
3) LimitTags: PRIMARY(Id), IdxTag(Tag), IdxTagAndModel(Tag, Model)
The first time I ran it, it went perfectly and I got my results (696 records) in 0.016 sec. I also ran the EXPLAIN command and got the following results. When I took the screenshot, I unfortunately neglected to expand the ref column so I don't know what two of the values are and am having trouble reproducing these results.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE T index IdxTag,IdxTagAndModel IdxTagAndModel 49 NULL 1427 Using index; Using temporary
1 SIMPLE N ref IdxTag IdxTag 7 NoIdea 1238 Using index; Distinct
1 SIMPLE P eq_ref PRIMARY,IdxDate,IdxIdAndDate IdxDate 38 NoIdea 1 Using where; Distinct
Thinking everything was well in hand, I then tried the exact same query on the all 4 PropData tables and UNIONed them together to get a complete list of the tag/model pairs. After running the query, it took more than 2 minutes before I stopped it for taking too long. I tried running the explain command on the original query shown above, using the exact same tables, and instead of getting the same results, I got the following:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE P range PRIMARY,IdxDate,IdxIdAndDate IdxDate 24 NULL 1785585 Using where; Using Index; Using temporary
1 SIMPLE N eq_ref PRIMARY,IdxTag PRIMARY 38 P.Id 1
1 SIMPLE T ref IdxTag,IdxTagAndModel IdxTag 7 N.Tag 1
Running the original query now takes > 30 sec instead of ~0.016 sec. The data and the indexes on these tables have definitely not changed, and I ran the explain command for these queries about 5 minutes apart.
What just happened? I can't figure out:
Why did the order of the rows in the explain output change?
Why did MySQL decide to use different indices all of a sudden?
Does anyone have any ideas or input? I searched around for other posts, but no one seems to have experienced the following results.
Edit 1:
I was able to reproduce this once when I tried to cancel the query (without explain) mid execution, which caused MySQL Workbench to crash. On restart, it worked the first time, giving me the results instantly. When I ran the query with all 4 tables, the indexes switched again and I experienced the same phenomenon as above, but with a new, different set of EXPLAIN results:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE P range PRIMARY,IdxDate,IdxIdAndDate IdxDate 24 NULL 1796958 Using where; Using Index; Using temporary
1 SIMPLE N eq_ref PRIMARY,IdxTag PRIMARY 38 P.Id 1
1 SIMPLE T ref IdxTag,IdxTagAndModel IdxTagAndModel 7 N.Tag 1 Using index
I tried forcing the query to use the same indexes that were shown in the desired first set of EXPLAIN outputs:
SELECT DISTINCT T.Tag, T.Model
FROM PropData P FORCE INDEX (PRIMARY)
JOIN PropDataTag N FORCE INDEX (IdxTag) ON P.Id=N.Id
JOIN LimitTags T FORCE INDEX (IdxTagAndModel) ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
And I got these results from explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE N index IdxTag PRIMARY 7 P.Id 1 Using index; Using temporary
1 SIMPLE T ref IdxTagAndModel IdxTagAndModel 7 N.Tag 1 Using index
1 SIMPLE P eq_ref PRIMARY IdxDate 38 NULL 1796958 Using where; Distinct
The main differences I see between these latest results and the original working version is that the IdxTagAndModel key only has a key_len of 7 rather than 49, and that table N doesn't have Distinct in the extra column.
Other differences to note are that the number of rows is different for table P, and that Using Temporary is on table N instead of T.
Edit 2:
Here is the full query I perform that seemed to switch which indexes were used:
SELECT DISTINCT T.Tag, T.Model
FROM PropData1 P
JOIN PropDataTag1 N ON P.Id=N.Id
JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData2 P
JOIN PropDataTag2 N ON P.Id=N.Id
JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData3 P
JOIN PropDataTag3 N ON P.Id=N.Id
JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
UNION
SELECT DISTINCT T.Tag, T.Model
FROM PropData4 P
JOIN PropDataTag4 N ON P.Id=N.Id
JOIN LimitTags T ON N.Tag=T.Tag
WHERE P.Date BETWEEN '0000-01-01 00:00:00' AND '9999-12-31 23:59:59'
I didn't include it initially because it's literally the same query repeated 3 times on different tables. Each table holds a different kind of data such as a double, or a BLOB, but they aren't used at all in this query.