EDIT: there was a mistake in the following question that explains the observations. I could delete the question but this might still be useful to someone. The mistake was that the actual query running on the server was SELECT * FROM t
(which was silly) when I thought it was running SELECT t.* FROM t
(which makes all the difference). See tobyobrian's answer and the comments to it.
I've a too slow query in a situation with a schema as follows. Table t
has data rows indexed by t_id
. t
adjoins tables x
and y
via junction tables t_x
and t_y
each of which contains only the foreigns keys required for the JOINs:
CREATE TABLE t (
t_id INT NOT NULL PRIMARY KEY,
data columns...
);
CREATE TABLE t_x (
t_id INT NOT NULL,
x_id INT NOT NULL,
PRIMARY KEY (t_id, x_id),
KEY (x_id)
);
CREATE TABLE t_y (
t_id INT NOT NULL,
y_id INT NOT NULL,
PRIMARY KEY (t_id, y_id),
KEY (y_id)
);
I need to export the stray rows in t
, i.e. those not referenced in either junction table.
SELECT t.* FROM t
LEFT JOIN t_x ON t_x.t_id=t.t_id
LEFT JOIN t_y ON t_y.t_id=t.t_id
WHERE t_x.t_id IS NULL OR t_y.t_id IS NULL
INTO OUTFILE ...;
t
has 21 M rows while t_x
and t_y
both have about 25 M rows. So this is naturally going to be a slow query.
I'm using MyISAM so I thought I'd try to speed it up by preloading the t_x
and t_y
indexes. The combined size of t_x.MYI
and t_y.MYI
was about 1.2 M bytes so I created a dedicated key buffer for them, assigned their PRIMARY keys to the dedicated buffer and LOAD INDEX INTO CACHE'ed them.
But as I watch the query in operation, mysqld is using about 1% CPU, the average system IO pending queue length is around 5, and mysqld's average seek size is in the 250 k range. Moreover, nearly all the IO is mysqld reading from t_x.MYI
and t_x.MYD
.
I don't understand:
Why mysqld is reading the
.MYD
files at all?Why mysqld isn't using the preloaded the
t_x
andt_y
indexes?
Could it have something to do with the t_x
and t_y
PRIMARY keys being over two columns?
EDIT: The query explained:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------+---------+-----------+----------+-------------+
| 1 | SIMPLE | t | ALL | NULL | NULL | NULL | NULL | 20980052 | |
| 1 | SIMPLE | t_x | ref | PRIMARY | PRIMARY | 4 | db.t.t_id | 235849 | Using index |
| 1 | SIMPLE | t_y | ref | PRIMARY | PRIMARY | 4 | db.t.t_id | 207947 | Using where |
+----+-------------+-------+------+---------------+---------+---------+-----------+----------+-------------+