I stumbled upon this today and was quite shocked. When searching Google I normally see the question asked the other way around, as in: using limit caused the query to return slower.
I have a MySQL table with a few million rows in it.
The PK is id, and as such it's a unique index.
When I performed a query of the form select a, b, c, ... from table where id in (1, 2, 3, ..., 5000) it took about 15-20 minutes to fetch all results.
However, when I simply added limit 1000000 at the end (a number far larger than needed, on purpose), it returned in a few seconds.
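To make the two variants concrete, they were roughly of this form (column names abbreviated and the IN list shortened here, exactly as above):

-- original query, takes about 15-20 minutes
select a, b, c, ... from table where id in (1, 2, 3, ..., 5000);
-- same query with the oversized limit, returns in a few seconds
select a, b, c, ... from table where id in (1, 2, 3, ..., 5000) limit 1000000;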
I know that using limit with a number smaller than the result set helps, since the query can return as soon as the "quota" is filled, but here I can't see the reason for such a dramatic improvement.
Can anyone please explain it?
Should I just add a limit to every query to improve its performance?
Why doesn't MySQL execute the query the same way with and without it?
Update
As requested, the EXPLAIN for each:
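These are from running EXPLAIN on the two statements above, i.e. something like:

-- plan for the fast variant
EXPLAIN select a, b, c, ... from table where id in (1, 2, 3, ..., 5000) limit 1000000;
-- plan for the slow variant
EXPLAIN select a, b, c, ... from table where id in (1, 2, 3, ..., 5000);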
With limit (takes a few seconds)
{
  "id" : 1,
  "select_type" : "SIMPLE",
  "table" : "table",
  "partitions" : null,
  "type" : "range",
  "possible_keys" : "PRIMARY",
  "key" : "PRIMARY",
  "key_len" : "4",
  "ref" : null,
  "rows" : 4485,
  "filtered" : 100.0,
  "Extra" : "Using where"
}
Without limit (takes 15-20 minutes)
{
  "id" : 1,
  "select_type" : "SIMPLE",
  "table" : "table",
  "partitions" : null,
  "type" : "ALL",
  "possible_keys" : "PRIMARY",
  "key" : null,
  "key_len" : null,
  "ref" : null,
  "rows" : 69950423,
  "filtered" : 50.0,
  "Extra" : "Using where"
}
I'm not fluent in reading these, but it looks like it used the key when I used limit and didn't when I ran the query without it. There are possibly other differences in the filtered and type fields, which I don't know the meaning of.
How come?
Update 2
A lot of questions were asked, so I'll attempt to provide details for all of them.
The MySQL version is 8.0.28 and the table engine is InnoDB.
I've run the tests a few times, one after the other, not just once.
Running the same EXPLAIN with fewer (10) values in the IN clause returned the same result, both with limit and without it!
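That test was roughly the following (the ten values here are just illustrative):

-- with only 10 ids in the IN clause
EXPLAIN select a, b, c, ... from table where id in (1, 2, ..., 10);
EXPLAIN select a, b, c, ... from table where id in (1, 2, ..., 10) limit 1000000;

Both returned: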
{
  "id" : 1,
  "select_type" : "SIMPLE",
  "table" : "table",
  "partitions" : null,
  "type" : "range",
  "possible_keys" : "PRIMARY",
  "key" : "PRIMARY",
  "key_len" : "4",
  "ref" : null,
  "rows" : 10,
  "filtered" : 100.0,
  "Extra" : "Using where"
}
Now the EXPLAIN FORMAT=JSON output (with parts redacted):
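These were produced with statements along the lines of the following (again with the real column list and the full 5000-value IN list, which are redacted here as well):

EXPLAIN FORMAT=JSON select a, b, c, ... from table where id in (1, 2, 3, ..., 5000);
EXPLAIN FORMAT=JSON select a, b, c, ... from table where id in (1, 2, 3, ..., 5000) limit 1000000;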
Without limit
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "8369910.88"
    },
    "table": {
      "table_name": "table",
      "access_type": "ALL",
      "possible_keys": [
        "PRIMARY"
      ],
      "rows_examined_per_scan": 70138598,
      "rows_produced_per_join": 35069299,
      "filtered": "50.00",
      "cost_info": {
        "read_cost": "4862980.98",
        "eval_cost": "3506929.90",
        "prefix_cost": "8369910.88",
        "data_read_per_join": "558G"
      },
      "used_columns": [...],
      "attached_condition": "(`db`.`table`.`id` in (...))"
    }
  }
}
With limit
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "8371410.92"
    },
    "table": {
      "table_name": "table",
      "access_type": "range",
      "possible_keys": [
        "PRIMARY"
      ],
      "key": "PRIMARY",
      "used_key_parts": [
        "id"
      ],
      "key_length": "4",
      "rows_examined_per_scan": 4485,
      "rows_produced_per_join": 35069255,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "4864485.17",
        "eval_cost": "3506925.54",
        "prefix_cost": "8371410.92",
        "data_read_per_join": "558G"
      },
      "used_columns": [...],
      "attached_condition": "(`db`.`table`.`id` in (...))"
    }
  }
}