SQL: the order of "BETWEEN AND" and "<> (!=)"

Question

A sql query：

For example:

select id from user where age != 20 and age between 18 and 30;
select id from user where age between 18 and 30 and age != 20;

The above is just a simple example.

Question:

If there is a lot of data, the above two statements will have any difference in performance? How can I choose it?

Tips: If the field 'age' already has an index.

Unlike the above question. I have been asked the question is:

I do not care the order of "where",I only care about if the different order of <> and bewteen and has different effects, because they belong to the scope of the query, and the scope of the query may exist data tilt problem.

Just like @axiac said

Check out what the query optimizer is doing by using EXPLAIN — Spechal, May 21 '17 at 07:27

axiac · Accepted Answer · 2017-05-21T07:41:34.170

Afaik the MySQL query optimizer generates the same execution plan for both your queries. It uses the index to check only the rows having age between 18 and 30 then, for each row, it checks the condition age != 20 (also using the data from index).

The explanation above assumes the values in the age columns are various. MySQL uses the age between 18 and 30 condition first because it filters out more rows than age != 20. However, if the data in the table is not balanced and most of the rows in the table have age = 20, MySQL will pick this age != 20 condition first (because it eliminates more rows than the other condition).

I cannot tell how many rows must have age = 20 in order for the query optimizer use the age != 20 condition for filtering but I'm sure the percentage has to be high; it could reach the threshold if the persons in your table are selected from a limited group (f.e. students on the same year of study) but it won't reach it for sure if the ages from the table follow the normal distribution.

MySQL uses one condition or the other to filter out as many rows as it can based on the actual data you have in the table when you run the query. It can change the execution plan later, when the data in the table changes.

If id is the PK of your table, it has an index on age and it uses the InnoDB engine then the query above doesn't need to read the table data. Both the WHERE conditions and the SELECT expressions can be evaluated using the index information. I would tell that no further optimization is possible for it.

SQL: the order of "BETWEEN AND" and "<> (!=)"

1 Answers1