
I have a table with about 10 million rows and an index on a date field. When I try to extract the unique values of the indexed field, Postgres runs a sequential scan even though the result set contains only 26 items. Why is the optimiser picking this plan, and what can I do to avoid it?

explain select "labelDate" from pages group by "labelDate";
                              QUERY PLAN
-----------------------------------------------------------------------
 HashAggregate  (cost=524616.78..524617.04 rows=26 width=4)
   Group Key: "labelDate"
   ->  Seq Scan on pages  (cost=0.00..499082.42 rows=10213742 width=4)
(3 rows)
Charlie Clark

1 Answer

I think the problem here is that the query planner wants to read the whole table because you have a GROUP BY clause, even though you do not use any aggregate function. It is therefore similar to the well-known "Why is count(*) so slow?" issue, which you will find asked in many forms here.

In your case the query itself is a bit odd: what you actually want is expressed more directly by this simpler query:

SELECT DISTINCT "labelDate" FROM pages;
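
Note that Postgres (at least in current releases) has no built-in "loose" (skip) index scan, so on a table this size even the DISTINCT form may still be planned as a sequential scan. If it is, a recursive CTE can emulate a loose index scan by walking the index one distinct value at a time. A sketch, assuming the index on "labelDate" described in the question:

-- Fetch the smallest value, then repeatedly fetch the next-larger one;
-- each step is a cheap index probe instead of a full scan.
WITH RECURSIVE dates AS (
    (SELECT "labelDate" FROM pages ORDER BY "labelDate" LIMIT 1)
    UNION ALL
    SELECT (SELECT "labelDate" FROM pages
             WHERE "labelDate" > d."labelDate"
             ORDER BY "labelDate" LIMIT 1)
    FROM dates d
    WHERE d."labelDate" IS NOT NULL  -- recursion ends when no larger value exists
)
SELECT "labelDate" FROM dates WHERE "labelDate" IS NOT NULL;

With 26 distinct dates this needs roughly 27 index probes rather than a scan of all 10 million rows.
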
Patrick