2

I have a very simple SQL:

select * from email.email_task where acquire_time < now() and state IN ('CREATED', 'RELEASED') order by creation_time asc limit 1;

I have 2 indexes created:

  1. Index of state
  2. Index of state, acquire_time, creation_time

Ideally I think Postgres should pick the 2nd one since it matches every column required in this SQL:

However the execution plan shows differently, it uses neither of the indexes:

Limit  (cost=187404.36..187404.36 rows=1 width=743)
   ->  Sort  (cost=187404.36..190753.58 rows=1339690 width=743)
         Sort Key: creation_time
         ->  Seq Scan on email_task  (cost=0.00..180705.91 rows=1339690 width=743)
               Filter: (((state)::text = 'CREATED'::text) AND (acquire_time < now()))

I understand that if the number of rows returned arrives like 10% of total, then it would pick Seq Scan over Index Scan. (As explained at Why does PostgreSQL perform sequential scan on indexed column? ) So that's why index1 is not picked.

What I don't understand is why index2 is not picked since matches all the columns?

Then I tried a 3rd index:

  1. Index of create_time, acquire_time, state

And this time it uses the index3 (I add the index using another smaller database perf_1 because the original one has 2 million rows and it takes too much time)

Limit  (cost=0.29..0.36 rows=1 width=75) (actual time=0.043..0.043 rows=1 loops=1)
   ->  Index Scan using perf_1 on email_task  (cost=0.29..763.76 rows=9998 width=75) (actual time=0.042..0.042 rows=1 loops=1)
         Index Cond: (acquire_time < now())
         Filter: ((state)::text = ANY ('{CREATED,RELEASED}'::text[]))

It seems that, Postgres execution planner is picking the order by clause first then the where clause which is a little bit counter-intuitive.

Is my understanding correct or there are some other factors that impact the Postgres planner?

Thanks in advance.

abuuu
  • 367
  • 3
  • 14
  • The last query plan using an index of `acquire_time ` not any of what you claim you have indexes made. Also, the plan does not contain `Sort` condition, so for that plan, no `ORDER BY` was used. Meaning, third query plan is not about the query you provided. – Kristo Mägi Jul 20 '17 at 00:58
  • For the second index not used, is because 2 of columns are in WHERE and 1 is at ORDER BY that you have an index for. In order 2nd index to kick in all conditions have to exist at the same 'level'. Meaning, e.g. all 3 columns are at WHERE. – Kristo Mägi Jul 20 '17 at 01:01
  • Hi @KristoMägi. Thanks for the comment, The 3rd query plan used an index of "perf_1" which is (creation_time, acquire_time, status). The acquire_time is actually the column name. I confirmed that Order By is used in the SQL: `explain analyze select * from email.email_task where acquire_time < now() and state IN ('CREATED', 'RELEASED') order by creation_time asc limit 1` – abuuu Jul 20 '17 at 03:41

0 Answers0