0

Query:

select * from table_a where col_a = 'value1' and col_b ='value2' order by id desc limit 1

Indexes:

col_a is indexed but col_b is not. col_a has a high cardinality (2M)

The entire table consists of 28M rows. No. of rows with col_a = 'value1' is 22,000. The latest id is 28M. The latest rows with col_a = 'value1' has id somewhere in 25M-25.5M range.

Ideally it should scan these 22000 rows only and give us the result. But we have seen that mysql is scanning these 3M rows (28M - 25M primary key id value) and then returning the result.

Using mysql explain we found out that PRIMARY key is being used if the limit is set to less than 20 but after that user_id is being prioritised.

Has anyone else seen this behaviour? Is there any flag which can be set which will avoid scanning primary key ? (we don't want to use force index(col_a_idx). Is there any other way which could avoid this ?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828

1 Answers1

1

Although How to hint the index to use in a MySQL select query? discusses the question as stated, I suggest that there is a better way to optimize the query.

INDEX(col_a, col_b, id)

(And Drop INDEX(col_a))

This will allow the query to run faster than forcing the use of PRIMARY KEY(id).

With this index, the Optimizer will automatically use it and look at exactly 1 row, not 28M, not 22000.

If col_b is TEXT, this new index won't work. Let's see SHOW CREATE TABLE, and please explain what type of stuff is in col_b.

Perhaps there is a datatype issue. Perhaps there is something goofy about your index(col_a).

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • So basically we dont want to create another index. Looked at your index hint and that solved our problem. It started using the col_a index. Col_b was integer with 2 values and that's why we didnt create index on that. – theprikshit Mar 20 '21 at 10:33
  • Also not sure how beneficial it would be to create index with primary key. Since we are not filtering directly on id, it would jus use up extra index space – theprikshit Mar 20 '21 at 10:35
  • @theprikshit - A composite (multi-column) index is _not_ the same as separate indexes on each column. It is often much better. – Rick James Mar 20 '21 at 15:25