I have several queries that could use some optimization since they take a considerable amount of time, so after reading many of the posts on this site I started to modify my schema and adding/changing indexes to appropriately speed up queries. In most cases I had great success, however in this particular case I'm stuck and I'm not sure what I'm doing wrong.
I have a table that has about 3.5 million rows in it.
select count(*) from post;
+----------+
| count(*) |
+----------+
| 3652904 |
+----------+
and is created as follows:
Create Table: CREATE TABLE `post` (
`id` varchar(255) NOT NULL DEFAULT '',
`page_id` bigint(20) NOT NULL DEFAULT '0',
`post_id` bigint(20) NOT NULL DEFAULT '0',
`type` varchar(45) CHARACTER SET latin1 NOT NULL,
...
`created_time` timestamp NULL DEFAULT NULL,
`updated_time` timestamp NULL DEFAULT NULL,
`timestamp` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`page_id`,`post_id`),
KEY `created_time` (`created_time`),
KEY `target_id` (`target_id`),
KEY `id` (`id`) USING BTREE
) ENGINE=TokuDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (page_id)
PARTITIONS 10 */
and another table that has very few rows and is created as follow:
select count(*) from privacy;
+----------+
| count(*) |
+----------+
| 19093 |
+----------+
Create Table: CREATE TABLE `privacy` (
`id` varchar(255) CHARACTER SET latin1 NOT NULL,
`page_id` bigint(20) DEFAULT '0',
`description` text CHARACTER SET latin1,
`value` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`allow` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`deny` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`json` text CHARACTER SET latin1,
`timestamp` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `page_id` (`page_id`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8
The select that I'm trying to optimize is as follow:
explain partitions
SELECT
post.id, post.type, privacy.description
FROM
post
LEFT JOIN privacy ON privacy.id = post.id
WHERE post.page_id = 12854644836;
+----+-------------+---------+------------+------+---------------+---------+---------+-------+-------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+-------+-------+
| 1 | SIMPLE | post | p6 | ref | PRIMARY | PRIMARY | 8 | const | 34685 | |
| 1 | SIMPLE | privacy | NULL | ALL | NULL | NULL | NULL | NULL | 19093 | |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+-------+-------+
Unfortunately this select is taking several minutes and I can't figure out why. I noticed that the explain doesn't pick up the primary key and I wonder if this is due to the difference in charset between the two tables. Even so from the explain there should not be that many rows involved, and yet it still takes that long to execute.
16754 rows in set (5 min 33.68 sec)
Selects from the tables alone are pretty fast.
select * from post where page_id = 12854644836;
16754 rows in set (0.22 sec)
and
select * from privacy where page_id = 12854644836;
234 rows in set (0.01 sec)
Can some of the MySQL Gurus on this site point me in the right direction? Thanks :)