Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

Question

I have a set of consecutive rows I want to get based upon their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance between between:

SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);

and:

SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;

both are totally different thing. One checks range like analog signal, another one check states like digital signal. So this dont come for performance comparison — Sadat, Jul 22 '10 at 11:27
logically the between should perform better, since it will make two comparisons per element instead of the number of ids in the IN case (*but this is just my feeling .. no hard evidence to support this*) — Gabriele Petrioli, Jul 22 '10 at 11:27
You should consider reassigning the accepted answer (if that is possible) to LukasEnder. Andomar's answer is wrong and LukasEnder explains why. — Code Commander, Oct 06 '12 at 17:34

Lukas Eder · Accepted Answer · 2016-12-30T10:29:35.337

19

BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as statistics are still accurate. Let's assume:

m is the size of your table
n is the size of your range

Index can be used (`n` is tiny compared to `m`)

In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, and then traverse at most n index leaf nodes. The complexity will be O(n + log m)
IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m) ... which is always worse (neglibile for very small tables m or very small ranges n)

Index cannot be used (`n` is a significant portion of `m`)

In any case, you'll get a full table scan and evaluate the predicate on each row:

BETWEEN needs to evaluate two predicates: One for the lower and one for the upper bound. The complexity is O(m)
IN needs to evaluate at most n predicates. The complexity is O(m * n) ... which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap, rather than a list of predicates.

edited Dec 30 '16 at 10:29

answered Jun 15 '11 at 07:29

Lukas Eder

211,314
129
689
1,509

1

I would expect a range scan to be better than a unique scan for scanning ranges. Otherwise, why would Oracle implement the range scan at all? – Andomar Oct 07 '12 at 20:08
On mysql IN can have serious performance impacts in comparison to between, I've seen stalls of many seconds where IN contained a few thousand numbers. The same query with BETWEEN takes a few milliseconds. So when you have the choice: always use between – John May 26 '20 at 02:40
@John: Never say "always". If your `IN` list has 1-2 elements, I somewhat doubt that your claim is correct. – Lukas Eder May 26 '20 at 09:46
@LukasEder Between will be as fast as IN when using 2+ elements, so you lose nothing when using it. And when using 1 element you've no reason to use any of both. The 'always' stays valid no matter if 2 elements or 2 million, the more elements you target the more efficient it will be. There is no reason whatsoever to use IN when BETWEEN can achieve the same, both syntax variants have their purpose. When using IN as replacement for BETWEEN you rape it's purpose. – John May 26 '20 at 12:07

score 16 · Answer 2 · answered Jul 22 '10 at 11:29

16

a between b and c is a macro that expands to b <= a and a <= c.

a in (b,c,d) is a macro that expands to a=b or a=c or a=d.

Assuming your n and nk are integer, both should end up meaning the same. The between variant should be much faster because it's only two compares, versus nk - n compares for the in variant.

answered Jul 22 '10 at 11:29

Andomar

232,371
49
380
404

The shorter string of the BETWEEN clause also parses more quickly. – Erick Robertson Jul 22 '10 at 11:31
Great, thanks. I'd give you the Answer right now but SO says I need to wait 7 minutes. – pr1001 Jul 22 '10 at 11:32
@LukasEder is right. Depending on your indexes, `IN` can turn out to be much faster. The best way to know is to benchmark both options in your particular case. – Code Commander Oct 06 '12 at 17:31
actually `a in (b,c,d)` is a macro which expands to `a = any (b,c,d)` (see SQL-92 standard) – eci Oct 23 '12 at 12:14

score 4 · Answer 3 · edited May 23 '17 at 12:08

I have done research for this question. I have 11M rows in my table. I have executed two queries on that:

Query 1:SELECT * FROM PLAYERS WHERE SCORE BETWEEN 10 TO 20

Query 2:SELECT * FROM PLAYERS WHERE SCORE IN (10,11,...,20)

While execution time, both queries are translated as Andomar said above.

Among both queries, Query 1 is running faster than Query 2.

To know more follow this link:

Performance of BETWEEN VS IN() in MySQL

Thank you.

score 0 · Answer 4 · answered Oct 29 '20 at 17:06

0

In many database servers, IN() is just a synonym for multiple OR clauses, because the two are logically equivalent. Not so in MySQL, which sorts the values in the IN() list and uses a fast binary search to see whether a value is in the list. This is O(Log n) in the size of the list, whereas an equivalent series of OR clauses is O(n) in the size of the list (i.e., much slower for large lists)

answered Oct 29 '20 at 17:06

Dhrubo Saha

11
1
4

I believe this is a direct quote from "High Performance MySQL: Optimization, Backups, and Replication" is it not? If so, it should be noted as such! – Aaron Francis Nov 16 '21 at 03:10

Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

4 Answers4

Index can be used (`n` is tiny compared to `m`)

Index cannot be used (`n` is a significant portion of `m`)

Linked

Related

Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

4 Answers4

Index can be used (n is tiny compared to m)

Index cannot be used (n is a significant portion of m)

Linked

Related

Index can be used (`n` is tiny compared to `m`)

Index cannot be used (`n` is a significant portion of `m`)