We have 2 tables:
CREATE TABLE `task_information`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`project_id` varchar(64) NOT NULL DEFAULT '0',
`task_id` varchar(64) NOT NULL DEFAULT '0',
`machine_id` varchar(16) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `idx_project` (`project_id`,`task_id`),
KEY `idx_task` (`task_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1100523 DEFAULT CHARSET=utf8mb4
CREATE TABLE `task_process`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`task_id` varchar(64) NOT NULL DEFAULT '0',
`project_id` varchar(64) NOT NULL DEFAULT '0',
`status` varchar(16) NOT NULL DEFAULT '0'
PRIMARY KEY (`id`),
UNIQUE KEY `uniq_task` (`task_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1096437 DEFAULT CHARSET=utf8mb4
These two tables have a one-to-one relationship based on "task_id".
The two tables have almost 800K entries. Also there will be lots of "success" in the "status" column. The "status" column has exactly three values: "success", "fail" and "start", their ratio is roughly 100:1:1.
And now I want to select the task_id where machined_id is specified and the status equals "success", my sql query looks like this:
select *
from task_information
inner join task_process on task_process.task_id = task_information.task_id
and task_information.machine_id = "RA00906"
where task_process.status="success"\G;
This one takes 4.96s, the result from explain is as follows:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: task_process
partitions: NULL
type: ALL
possible_keys: uniq_task
key: NULL
key_len: NULL
ref: NULL
rows: 760145
filtered: 10.00
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: task_information
partitions: NULL
type: ref
possible_keys: idx_task
key: idx_task
key_len: 258
ref: task_process.task_id
rows: 1
filtered: 10.00
Extra: Using where
But if I change "=" to "like" the speed becomes much faster (around 0.9 seconds):
select status
from task_information
inner join task_process on task_process.task_id = task_information.task_id
and task_information.machine_id = "RA00906"
where task_process.status like "success"
The result from explain is like this:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: task_information
partitions: NULL
type: ALL
possible_keys: idx_task
key: NULL
key_len: NULL
ref: NULL
rows: 759749
filtered: 10.00
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: task_process
partitions: NULL
type: eq_ref
possible_keys: uniq_task
key: uniq_task
key_len: 258
ref: task_information.task_id
rows: 1
filtered: 11.11
Extra: Using where
Both of these queries return around 500 rows, but "LIKE" is much faster than "=" even though functionally they should be roughly the same.
Can someone please explain to me the reason behind this?
tl;dr using "LIKE" is much faster than "=" in the same "inner join" query, what is the reason behind this?