-1

I have a 40M record table having id as the primary key. I execute a select statement as follows:

select * from messages where (some condition) order by id desc limit 20;

It is ok and the query executes in a reasonable time. But when I add an always valid condition as follows, It takes a huge time.

select * from messages where id > 0 and (some condition) order by id desc limit 20;

I guess it is a bug and makes MySQL search from the top side of the table instead of the bottom side. If there is any other justification or optimization it would be great a help.

p.s. with a high probability, the results are found in the last 10% records of my table.

p.p.s. the some condition is like where col1 = x1 and col2 = x2 where col1 and col2 are indexed.

s.abbaasi
  • 952
  • 6
  • 14
  • 1
    Do you see the same effect if you replace `id > 0` with `1 = 1`? – Barmar Jan 11 '23 at 19:38
  • 4
    Use `EXPLAIN` to see the difference in query plans. – Barmar Jan 11 '23 at 19:39
  • You know it's "always valid". Does the DBMS? What did you tell it, that it would know that? PS Debug questions require a [mre]. Good questions "show research effort". [ask] [Help] – philipxy Jan 11 '23 at 21:02
  • 1
    [Tips for asking a good SQL question](https://meta.stackoverflow.com/q/271055/3404097) [Asking query performance questions](https://dba.meta.stackexchange.com/q/3034/43932) [How to Optimize Queries in a Database - The Basics](https://stackoverflow.com/q/3191623/3404097) [What are your most common sql optimizations?](https://stackoverflow.com/q/1332778/3404097) [When and why are database joins expensive?](https://stackoverflow.com/q/173726/3404097) – philipxy Jan 11 '23 at 21:04
  • If the "valid condition" is a `MATCH`, then all sorts of different issues arise. So... Please provide the actual condition. – Rick James Jan 12 '23 at 05:28
  • @RickJames I edited the post and provided some explanations on `some condition` – s.abbaasi Jan 12 '23 at 23:17

3 Answers3

3

MySQL has to choose whether to use an index to process the WHERE clause, or use an index to control ORDER BY ... LIMIT ....

In the first query, the WHERE clause can't make effective use of an index, so it prefers to use the primary key index to optimize scanning in order by ORDER BY. In this case it stops when it finds 20 results that satisfy the WHERE condition.

In the second query, the id > 0 condition in the WHERE clause can make use of the index, so it prefers to use that instead of using the index for ORDER BY. In this case, it has to find all the results that match the WHERE condition, and then sort them by id.

I wouldn't really call this a bug, as there's no specification of precisely how a query should be optimized. It's not always easy for the query planner to determine the best way to make use of indexes. Using the index to filter the rows using WHERE id > x could be better if there aren't many rows that match that condition.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thanks, @Barmar. It makes sense. But is it true that database engine use just one index to scan the results? – s.abbaasi Jan 11 '23 at 20:32
  • 1
    MySQL is pretty limited in that respect. Read [How MySQL Uses Indexes](https://dev.mysql.com/doc/refman/8.0/en/mysql-indexes.html) – Barmar Jan 11 '23 at 20:36
  • 1
    MySQL uses one index per table reference, mostly. There are exceptions like [index merge optimization](https://dev.mysql.com/doc/refman/8.0/en/index-merge-optimization.html), but somehow that feature doesn't kick in as often as you might think it should. – Bill Karwin Jan 11 '23 at 22:34
  • Great answer, could you help to show me where to find short documents about how mysql choosing indexes, like you said in the first sentence? – Hi computer Jan 13 '23 at 09:46
  • See the link in my second comment. – Barmar Jan 13 '23 at 15:12
0

A query like this

select  *
    from  messages
    where  col1 = x1
      and  col2 = x2
    order by  id desc
    limit  20;

is best handled by a 'composite' index with the tests for '=' first:

INDEX(col1, col2, id)

Try it, I think it will be faster than either of the queries you are working with.

Rick James
  • 135,179
  • 13
  • 127
  • 222
-1

It’s not a bug. You are searching through a 40 million table where your where clause doesn’t have an index. Add an index on the column in your where clause and you will see substantial improvement.

  • But it shouldn't search the entire table, because it should use the `ORDER BY` to search it in the order of the indexed `id` column, and stop when it gets 20 results. – Barmar Jan 11 '23 at 19:39
  • That's what it does in the original query without `id > 0` in the `WHERE` clause. – Barmar Jan 11 '23 at 19:39
  • Dear @Ranabir Dey, as I wrote, the search process is fine just without id>0 and I managed the necessary indices before. – s.abbaasi Jan 11 '23 at 20:27