0

Need to search through table foo

foo structure is

id | something

There is an INDEX for field something

I want to search BETWEEN AS INT:

SELECT CAST(something as INT) as something_int FROM foo foo_1
WHERE something_int > 1 AND something_int < 9999

In this case the INDEX would be used or be broken?

123
  • 2,169
  • 3
  • 11
  • 35

2 Answers2

1
WHERE some_varchar BETWEEN '1' AND '2000'  -- fast but probably incorrect
WHERE some_varchar BETWEEN 1 AND 2000      -- slow but correct
WHERE some_int BETWEEN '1' AND '2000'      -- fast
WHERE some_int BETWEEN 1 AND 2000          -- fast (same as previous)

What is happening?

  • When comparing text to numeric, the text side is converted to numeric, and then numeric comparisons are performed.
  • Text to text comparison does a string comparison; numeric to numeric does a numeric comparison.
  • Above, I say "slow" meaning that no index can be used; "fast" if an index can be used.
  • The "incorrect" one has the same problem as sorting a set of numbers in a VARCHAR and then wondering why the list is out of order: 1,10,11,...,19,2,20,...,29,3, ...
  • CAST() is just an explicit version of the implicit conversion I am talking about here.
  • CAST('2000' TO INT) is done "at compile time", so the Optimizer sees it as simply 2000 (numeric, no function call).
  • some_varchar >= 1 on the other hand, is turned into CAST(some_varchar TO INT) >= 1 in the second example above.
  • As a Rule of Thumb, "hiding a column in a function call prevents using an index. See "sargable" in Wikipedia.
Rick James
  • 135,179
  • 13
  • 127
  • 222
0

No, the index will not be used.

CREATE TABLE foo(something varchar(20) primary key) engine=myisam;
INSERT INTO foo VALUES ('1|abc'), ('3456|def');

DESCRIBE SELECT * FROM foo WHERE CAST(something as INT) BETWEEN 1 AND 2000;
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
|    1 | SIMPLE      | tt    | index | NULL          | PRIMARY | 82      | NULL |    2 | Using where; Using index |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+

DESCRIBE SELECT * FROM foo WHERE something BETWEEN '1' AND '2000';
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
|    1 | SIMPLE      | tt    | range | PRIMARY       | PRIMARY | 82      | NULL |    1 | Using where; Using index |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+

Notice how possible_keys is NULL for the first query (and rows found is 2).

Note: this can happen even if the character set of the query doesn't match that of the index.

Create a separate INT index (e.g. using the function index syntax).

rustyx
  • 80,671
  • 25
  • 200
  • 267
  • 1
    What about "index" vs "range"? "Rows" is an estimate; you should not deduce anything from a minor difference such as "2" vs "1". Technically the first one _does_ use the index, but as being more efficient than using the data. (And there is a significant difference between MyISAM and InnoDB int this case.) The second one _does_ use the index -- as indicated by "range" and "Primary"; it is a simple string comparison. It is not the numeric comparison that was probably desired. Function indexes are not available until recent versions. – Rick James Jan 15 '20 at 01:05