
I have a table with about 10 million rows and an index on a date field. When I try to extract the unique values of the indexed field, Postgres runs a sequential scan even though the result set contains only 26 items. Why is the optimiser picking this plan, and what can I do to avoid it?

explain select "labelDate" from pages group by "labelDate";
                              QUERY PLAN
-----------------------------------------------------------------------
 HashAggregate  (cost=524616.78..524617.04 rows=26 width=4)
   Group Key: "labelDate"
   ->  Seq Scan on pages  (cost=0.00..499082.42 rows=10213742 width=4)
(3 rows)
Charlie Clark

1 Answer

I think the problem here is that the query planner wants to read the whole table because you have a GROUP BY clause, even though you do not use any aggregate function. It is therefore similar to the well-known "Why is count(*) so slow?" issue, which you will find asked in many forms here.

In your case the query itself is a bit odd: what you actually want is expressed more directly by this simpler query:

SELECT DISTINCT "labelDate" FROM pages;
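
Note that Postgres (at least in current releases) has no built-in "loose" (skip) index scan, so on a table this size even the DISTINCT form may still be planned as a sequential scan. If it is, a recursive CTE can emulate a loose index scan by walking the index one distinct value at a time. A sketch, assuming the index on "labelDate" described in the question:

-- Fetch the smallest value, then repeatedly fetch the next-larger one;
-- each step is a cheap index probe instead of a full scan.
WITH RECURSIVE dates AS (
    (SELECT "labelDate" FROM pages ORDER BY "labelDate" LIMIT 1)
    UNION ALL
    SELECT (SELECT "labelDate" FROM pages
             WHERE "labelDate" > d."labelDate"
             ORDER BY "labelDate" LIMIT 1)
    FROM dates d
    WHERE d."labelDate" IS NOT NULL  -- recursion ends when no larger value exists
)
SELECT "labelDate" FROM dates WHERE "labelDate" IS NOT NULL;

With 26 distinct dates this needs roughly 27 index probes rather than a scan of all 10 million rows.
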
Patrick