Improving select statement performance

Question

I have a table with around 500,000 rows, with a composite primary key index. I'm trying a simple select statement as follows

select * from transactions where account_id='1' and transaction_id='003a4955acdbcf72b5909f122f84d';

The explain statement gives me this

id | select_type | table        | partitions | type  | possible_keys | key     | key_len | ref         | rows | filtered | Extra
-------------------------------------------------------------------------------------------------------------------------------------
1 | SIMPLE       | transactions | NULL       | const | PRIMARY       | PRIMARY | 94      | const,const | 1    | 100.00    | NULL

My primary index is on account_id and transaction_id. My engine is InnoDB.

When I run my query it takes around 156 milliseconds.

Given that explain shows that only one row needs to be examined, I'm not sure how to optimize this further. What changes can I do to significantly improve this?

Do read: [Why is SELECT * considered harmful?](https://stackoverflow.com/q/3639861/2469308) — Madhur Bhaiya, Dec 13 '18 at 08:56
@Cynical that his query is taking too long and he's asking why and eventually how to optimize it? — Alexandre Elshobokshy, Dec 13 '18 at 08:58
"Too long" based on what? Since he's using a selection on the primary key, it's already the best search he can do, given this structure. — Cynical, Dec 13 '18 at 09:03
@Cynical the real problem was the total time it takes to update a particular transaction. the update statement was taking the same time as the select statement and so I assumed that if the search was faster, I could bulk update faster. Please correct me if I'm wrong — Sujay DSa, Dec 13 '18 at 09:30
I'd guess that the index (primary) doesn't fit into memory, so you need to touch the drive. If you use InnoDB - check the buffer pool size. A query like this should't take more than 1ms on modern hardware. — Paul Spiegel, Dec 13 '18 at 09:52
@PaulSpiegel I ran select count(*) from information_schema.innodb_buffer_page; and got 393216. Then I ran select table_name as Table_Name, index_name as Index_Name, count(*) as Page_Count, sum(data_size)/1024/1024 as Size_in_MB from information_schema.innodb_buffer_page group by table_name, index_name order by Size_in_MB desc; and saw my index is 64 MB. If this is right, then it does look like my index is larger than buffer pool size — Sujay DSa, Dec 13 '18 at 10:56
To find out buffer pool size you should execute `select @@innodb_buffer_pool_size` — Paul Spiegel, Dec 13 '18 at 11:08

score 2 · Answer 1 · answered Dec 13 '18 at 09:11

I'm going to speculate a bit, given the information provided: your primary key is composed of an integer field account_id and a varchar one called transaction_id.

Since they're both components of the PRIMARY index created when you defined them as PRIMARY KEY(account_id, transaction_id), as they are they're the best you can have.

I think the bottleneck here is the transaction_id: as a string, it requires more effort to be indexed, and to be searched for. Changing its type to a different, easier to search one (i.e. numbers) would probably help.

The only other improvement I see is to simplify the PRIMARY KEY itself, either by removing the account_id field (it seems useless to me, since the transaction_id tastes like being unique, but that depends on your design) or by substitute the whole key with an integer, AUTO INCREMENT value (not recommended).

Unfortunately, I can't change the transaction id data type. It's also not unique. — Sujay DSa, Dec 13 '18 at 09:32
That sounds painfully wrong, but then you have no other choice than keeping the double key. On your other comment, you should check which part of the update takes time: if it is the selection of the row to update, then you're asking the right question, otherwise you have to investigate more. — Cynical, Dec 13 '18 at 09:35

Improving select statement performance

1 Answers1