-1

I have searched all my Google links purple and found no solution to my problem.

From what I can tell MySQL is NOT short circuiting the OR operator in a where clause. A lot of search results explain why this could happen, but that does not practically help solve my problem.

We are running MySQL version 5.7.36.

Here is the results of the performance difference in the query:

Query with OR clause (2 s 761 ms):

select count(*) from account_transaction
       where (true or description like '%')
         and datetime between '2022-05-19 00:00:00' and '2022-05-26 00:00:00'
[2022-05-27 08:28:36] 1 row retrieved starting from 1 in 2 s 787 ms (execution: 2 s 761 ms, fetching: 26 ms)

Execution plan:

mysql> explain select count(*) from account_transaction
    ->        where (true or description like '%')
    ->          and datetime between '2022-05-19 00:00:00' and '2022-05-26 00:00:00';
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+---------+----------+-------------+
| id | select_type | table               | partitions | type | possible_keys                    | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | account_transaction | NULL       | ALL  | account_transaction_datetime_idx | NULL | NULL    | NULL | 1172455 |    19.69 | Using where |
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.03 sec)

Query without OR clause (267ms):

select count(*) from account_transaction
       where (true)
         and datetime between '2022-05-19 00:00:00' and '2022-05-26 00:00:00'
[2022-05-27 08:26:09] 1 row retrieved starting from 1 in 286 ms (execution: 267 ms, fetching: 19 ms)

Execution plan:

mysql> explain select count(*) from account_transaction
    ->        where (true)
    ->          and datetime between '2022-05-19 00:00:00' and '2022-05-26 00:00:00';
+----+-------------+---------------------+------------+-------+----------------------------------+----------------------------------+---------+------+--------+----------+--------------------------+
| id | select_type | table               | partitions | type  | possible_keys                    | key                              | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+---------------------+------------+-------+----------------------------------+----------------------------------+---------+------+--------+----------+--------------------------+
|  1 | SIMPLE      | account_transaction | NULL       | range | account_transaction_datetime_idx | account_transaction_datetime_idx | 6       | NULL | 230898 |   100.00 | Using where; Using index |
+----+-------------+---------------------+------------+-------+----------------------------------+----------------------------------+---------+------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

This query is generated by Spring JPA and therefore no convenient way to generate the query ourselves.

Tried adding another index, but the execution plan remains the same:

create index account_transaction__indextest
    on account_transaction (description asc, datetime desc);

Execution plan with additional index:

mysql> explain select * from account_transaction
    -> where (true or description like '%')
    ->   and datetime between '2022-05-19 00:00:00' and '2022-05-26 00:00:00'
    -> ;
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+------+----------+-------------+
| id | select_type | table               | partitions | type | possible_keys                    | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | account_transaction | NULL       | ALL  | account_transaction_datetime_idx | NULL | NULL    | NULL |  160 |    80.62 | Using where |
+----+-------------+---------------------+------------+------+----------------------------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
leonardseymore
  • 533
  • 5
  • 20
  • 1
    Where are execution plans for your queries? – Akina May 27 '22 at 06:57
  • I'll update the question and paste it there since the comment block doesn't have enough space. It already indicates that the index is not being used for the OR query – leonardseymore May 27 '22 at 07:07
  • 2
    Of course, the facts **must** be added into the question, not posted as a comment. – Akina May 27 '22 at 07:08
  • Updated question with the execution plans – leonardseymore May 27 '22 at 07:09
  • 1
    The source of the execution time difference is obvious, but I do not see the reasonable way to improve without an access to the query text (index forcing hint would help maybe). You may try to include `description` column into the index expression as a postfix (or a column prefix, like `(datetime, description(1))`). – Akina May 27 '22 at 07:14
  • I was thinking the same thing and try to avoid database hints to try and stay DB agnostic, especially since we use JPA abstractions. It looks like I might have to include additional columns in the index. This is just very unexpected behavior, but at least highlights why the query is taking much longer – leonardseymore May 27 '22 at 07:16

1 Answers1

1
  • Please provide SHOW CREATE TABLE so we know the datatypes, indexes, etc.
  • When timing a query, run it twice. The first might be slow because of loading the cache; hence the second may be faster.
  • The Optimizers looks at some crude statistics to decide which index would be most useful, or that no index is worth bothering with. To analyze your queries from that point of view, what count did you get from your queries? And how many rows are in the table?
  • The second EXPLAIN seems to imply that some index starts with datetime.
  • I prefer to construct queries without things like true or 1 or 1=1 or ... LIKE '%'. It makes the Optimizer work a little harder. And apparently it distracts it from "doing the right thing".
  • The "Using where" may be indicating that it is checking description LIKE '%'. If so, you could write a bug report to suggest optimizing that to TRUE. But it may be rejected because that may actually be treated as description IS NOT NULL ! Run SHOW WARNINGS; after performing the EXPLAIN.
  • The lack of "Using index" in Extras indicates that it had to reach into the data's BTree. That is, either no index was "covering", or the optimizer chose scan the table for other reasons.
  • 'Apples and oranges' -- The EXPLAINs and the timings for SELECT * is usually quite different than SELECT COUNT(*). Pick which one you want to discuss; don't mix them.
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I picked up something new when experimenting with the date range. Index is used depending on the dates. I see you answered this in https://stackoverflow.com/questions/66009634/mysql-date-range-query-indexing-is-working-for-limited-range-only – leonardseymore May 30 '22 at 07:45