Background
- users table has 2k rows
- relationships table has 1.5 million rows
- posts table has 2 million rows
- using MySQL version 5.7.34
Structure for users:
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL DEFAULT '',
`first_name` varchar(255) NOT NULL DEFAULT '',
`last_name` varchar(255) NOT NULL DEFAULT '',
`password` varchar(255) NOT NULL DEFAULT '',
`active` tinyint(1) NOT NULL,
`created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=3263 DEFAULT CHARSET=utf8
Structure for relationships:
CREATE TABLE `relationships` (
`user_id` int(11) unsigned NOT NULL,
`is_following_user_id` int(11) unsigned NOT NULL,
`created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY `user_id` (`user_id`,`is_following_user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Structure for posts:
CREATE TABLE `posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned NOT NULL,
`parent_post_id` int(11) DEFAULT NULL,
`content` varchar(255) DEFAULT '',
`created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
CONSTRAINT `users` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2412061 DEFAULT CHARSET=utf8
NOTE: User 922 has no relationships or posts, so none of the queries below return any rows; the time is spent on whatever index and/or table scans each plan performs.
This query takes 0.5ms:
# 0.5ms
select * from posts where user_id in (
select id from users inner join relationships
on users.id = relationships.is_following_user_id
where relationships.user_id = 922
);
Explain output for above fast query:
This query takes 500ms:
# 500ms
select * from posts where user_id in (
select id from users inner join relationships
on users.id = relationships.is_following_user_id
where relationships.user_id = 922
)
or user_id = 922;
Explain output for above slow query:
Clearly the second query has identified the same index as the first query (users.user_id), but as per the explain output it is specifically avoiding using it (key = NULL).
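For anyone reproducing this, here is a minimal sketch of how the plan for the slow query could be inspected in more detail on MySQL 5.7, assuming the schema above. `EXPLAIN FORMAT=JSON` and the optimizer trace are standard server features; what they report for this data is not shown here and would have to be run against the actual tables.

```sql
-- Structured plan for query #2, including cost estimates and
-- how the optimizer handles the IN (subquery) combined with OR.
EXPLAIN FORMAT=JSON
select * from posts where user_id in (
    select id from users inner join relationships
    on users.id = relationships.is_following_user_id
    where relationships.user_id = 922
)
or user_id = 922;

-- Optimizer trace for the same statement (session-scoped setting).
SET optimizer_trace = 'enabled=on';

select * from posts where user_id in (
    select id from users inner join relationships
    on users.id = relationships.is_following_user_id
    where relationships.user_id = 922
)
or user_id = 922;

-- The trace of the most recent statement records which access paths
-- were considered for each table and why they were chosen or rejected.
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```

Comparing the trace for query #1 against the trace for query #2 should show at which step the index stops being a candidate access path.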
This query takes 2.3 seconds:
# 2.3 seconds
select * from posts where user_id in (
select id from users inner join relationships
on users.id = relationships.is_following_user_id
where relationships.user_id = 922
union all
select 922
);
Explain output for above super slow query:
Questions:
- Why is query #2 not using the users.user_id index like query #1?
- Why is query #3 so much slower, and also not using the users.user_id index?