0

I want to search DISTINCT on a Postgresql 9.4 table with about 300 000 records. It takes almost 8 seconds. I read on this post that using this could speed it up. And it really did. Down to 0.26 sec.

SELECT COUNT(*) FROM (SELECT DISTINCT column_name FROM table_name) AS temp;

Is much faster than

COUNT(DISTINCT(column_name))

When I write this I get the result but I want to add a WHERE clause.

This works but takes over 7 sec.

SELECT COUNT(DISTINCT(species)) FROM darwincore2
WHERE darwincore2.dataeier ILIKE '%nnog%'

This works (0.26 sec.) but fails when I add the WHERE clause:

SELECT COUNT(*) FROM (SELECT DISTINCT species FROM darwincore2) as temp
WHERE darwincore2.dataeier ILIKE '%nnog%'

with:

ERROR:  missing FROM-clause entry for table "darwincore2"

Anyone know how I can fix this? Or am I trying to do something that does not work??

Community
  • 1
  • 1
Asle
  • 767
  • 2
  • 8
  • 22
  • `distinct` is ***NOT*** a function. `distinct(column_name)` makes no sense –  Nov 27 '15 at 15:49

1 Answers1

2

The WHERE clause should be in the subquery:

SELECT COUNT(*) 
FROM (
    SELECT DISTINCT species 
    FROM darwincore2
    WHERE darwincore2.dataeier ILIKE '%nnog%'
    ) as temp
klin
  • 112,967
  • 15
  • 204
  • 232
  • Thanks @klin. This worked just great and I guess I can add any filtering inside the FROM(....) section? – Asle Nov 30 '15 at 09:49
  • A subquery is a normal query, so you can filter data as you want. Treat a subquery as a virtual table, from which you can select data with an outer query. Btw, such a subquery is formally known as *derived table*. – klin Nov 30 '15 at 10:01