2

I am working with MySQL with this table. Here is the schema.

create table if not exists users
(
    id            bigint unsigned auto_increment
        primary key,
    created_at    datetime(3)  null,
    updated_at    datetime(3)  null,
    deleted_at    datetime(3)  null,
    email         varchar(255) not null,
    password_hash varchar(191) not null,
    constraint idx_users_email
        unique (email)
);

create index idx_email_pass
    on users (email, password_hash);

create index idx_users_deleted_at
    on users (deleted_at);

When I run this query

EXPLAIN SELECT * FROM `users` WHERE `users`.`email` = 'name@email.org' AND `users`.`password_hash` = 'anewpassword' AND `users`.`deleted_at` IS NULL ORDER BY `users`.`id` LIMIT 1

(This query, and the create table/index query and all that is generated by golang/gorm package, which is my ORM.)

The result is

+----+-------------+-------+------------+-------+-----------------------------------------------------+-----------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys                                       | key             | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-----------------------------------------------------+-----------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | users | NULL       | const | idx_users_email,idx_users_deleted_at,idx_email_pass | idx_users_email | 1022    | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+-----------------------------------------------------+-----------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

This possibly means that it is not utilizing the specialized and more appropriate index idx_email_pass, in favor of idx_users_email.

Is the gorm-added query field deleted_at IS NULL is the one creating problems? I don't think so.

Why? I tried EXPLAIN <query> without deleted_at IS NULL part, and it still picked the same index.

I created an index idx_email_pass_del, but that didn't get picked with the original query either.


So I want to understand the following:

  1. Will the correct index utilized when data grows larger? I guess I should say that at this point, there are like 5 rows in database.

  2. Are two separate indexes (A,B,C) and D index sufficient for a query WHERE A=a AND B=b AND C=c AND D IS NULL?

Shadow
  • 33,525
  • 10
  • 51
  • 64
Geet Patel
  • 21
  • 2
  • As for your second question: I wouldn't think so. An index on `(A,B,C)` allows you to find the rows where `A=a AND B=b AND C=c` efficiently. An index on `D` allows you to find the rows where `D IS NULL` efficiently. Still, intersecting these two sets of rows could be inefficient if both are large. If you want a query which tests `A, B, C, D` against certain values to be guaranteed to be efficient, use an index on `A, B, C, D`. – Luatic Aug 05 '23 at 15:26

1 Answers1

0

As you have a unique constraint on email the index on the email plus other columns is pointless - as both indexes will have an identical number of rows.

As the unique constraint index is smaller the optimiser will almost always prefer it over the other index

NickW
  • 8,430
  • 2
  • 6
  • 19
  • This answer is incorrect. You can create multi-cilumn indexes and mysql will use those if those are covering indexes over a unique index. – Shadow Aug 05 '23 at 17:12
  • Only if you are only selecting the columns used in the index i.e. all the data required is found in the index. However the question has SELECT * and therefore the index is “just” identifying the rows to be selected - in which case the smaller index is more efficient – NickW Aug 05 '23 at 17:58
  • @Shadow both your comments are not correct. In any case MySQL select the unique index. See [here](https://dbfiddle.uk/pxgpX3Gy) – Ergest Basha Aug 05 '23 at 18:58
  • @ErgestBasha your example has nithing to do with my comment – Shadow Aug 05 '23 at 19:17
  • @Shadow *You can create multi-column indexes and MySQL will use those if those are covering indexes over a unique index* , in the example it doesn't. It chooses the unique index – Ergest Basha Aug 05 '23 at 19:30