Whenever you write a query where you need to filter out rows on a range of values - then should I use the BETWEEN
clause or <=
and >=
?
Which one is better in performance?
Whenever you write a query where you need to filter out rows on a range of values - then should I use the BETWEEN
clause or <=
and >=
?
Which one is better in performance?
Neither. They create exactly the same execution plan.
The times where I use them depends not on performance, but on the data.
If the data are Discrete Values, then I use BETWEEN
...
x BETWEEN 0 AND 9
But if the data are Continuous Values, then that doesn't work so well...
x BETWEEN 0.000 AND 9.999999999999999999
Instead, I use >= AND <
...
x >= 0 AND x < 10
Interestingly, however, the >= AND <
technique actually works for both Continuous and Discrete data types. So, in general, I rarely use BETWEEN
at all.
Also, don't use BETWEEN
for date/time range queries.
What does the following really mean?
BETWEEN '20120201' AND '20120229'
Some people think that means get me the all the data from February, including all of the data anytime on February 29th. The above gets translated to:
BETWEEN '20120201 00:00:00.000' AND '20120229 00:00:00.000'
So if there is data on the 29th any time after midnight, your report is going to be incomplete.
People also try to be clever and pick the "end" of the day:
BETWEEN '00:00:00.000' AND '23:59:59.997'
That works if the data type is datetime
. If it is smalldatetime
the end of the range gets rounded up, and you may include data from the next day that you didn't mean to. If it's datetime2
you might actually miss a small portion of data that happened in the last 2+ milliseconds of the day. In most cases statistically irrelevant, but if the query is wrong, the query is wrong.
So for date range queries I always strongly recommend using an open-ended range, e.g. to report on the month of February the WHERE
clause would say "on or after February 1st, and before March 1st" as follows:
WHERE date_col >= '20120201' AND date_col < '20120301'
BETWEEN
can work as expected using the date
type only, but I still prefer an open-ended range in queries because later someone may change that underlying data type to allow it to include time.
I blogged a lot more details here: