Why does distinct on in subqueries hurt performance in PostgreSQL?

Question

I've got a table users with fields id and email. id is the primary key and email is indexed as well.

database> \d users
+-----------------------------+-----------------------------+-----------------------------------------------------+
| Column                      | Type                        | Modifiers                                           |
|-----------------------------+-----------------------------+-----------------------------------------------------|
| id                          | integer                     |  not null default nextval('users_id_seq'::regclass) |
| email                       | character varying           |                                                     |
+-----------------------------+-----------------------------+-----------------------------------------------------+
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "index_users_on_email" UNIQUE, btree (email)

If I query the table with a distinct on (email) clause in a subquery I get a significant performance penalty.

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (email)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Subquery Scan on t  (cost=8898.69..10077.84 rows=337 width=4) (actual time=221.133..250.782 rows=1 loops=1)                 |
|   Filter: (t.id = 123)                                                                                                      |
|   Rows Removed by Filter: 67379                                                                                             |
|   Buffers: shared hit=2824, temp read=288 written=289                                                                       |
|   ->  Unique  (cost=8898.69..9235.59 rows=67380 width=24) (actual time=221.121..247.582 rows=67380 loops=1)                 |
|         Buffers: shared hit=2824, temp read=288 written=289                                                                 |
|         ->  Sort  (cost=8898.69..9067.14 rows=67380 width=24) (actual time=221.120..239.573 rows=67380 loops=1)             |
|               Sort Key: users.email                                                                                         |
|               Sort Method: external merge  Disk: 2304kB                                                                     |
|               Buffers: shared hit=2824, temp read=288 written=289                                                           |
|               ->  Seq Scan on users  (cost=0.00..3494.80 rows=67380 width=24) (actual time=0.009..9.714 rows=67380 loops=1) |
|                     Buffers: shared hit=2821                                                                                |
| Planning Time: 0.243 ms                                                                                                     |
| Execution Time: 251.258 ms                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+

Compare this with distinct on (id) whose cost is less than one thousandth of the previous query.

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (id)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Unique  (cost=0.29..8.31 rows=1 width=4) (actual time=0.021..0.022 rows=1 loops=1)                                          |
|   Buffers: shared hit=3                                                                                                     |
|   ->  Index Only Scan using users_pkey on users  (cost=0.29..8.31 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1) |
|         Index Cond: (id = 123)                                                                                              |
|         Heap Fetches: 1                                                                                                     |
|         Buffers: shared hit=3                                                                                               |
| Planning Time: 0.090 ms                                                                                                     |
| Execution Time: 0.034 ms                                                                                                    |
+-----------------------------------------------------------------------------------------------------------------------------+

Why is this?

The real problem I'm having is that I'm trying to create a view that does distinct on an indexed column that isn't unique and the performance is very bad.

You're forcing the server to do a bogus `distinct` since you're going to discard the results. In the last case the server was able to detect this since only the unique `id` column is involved and simply discard that `DISTINCT` operation. In the previous case it *had* to calculate `distinct on` to find which `ID` values are returned before it can filter them — Panagiotis Kanavos, Jul 15 '20 at 15:36
The two queries return different results. The first one may return *no* results if there is an `id` less than `123` for an email. In the second case there's always a result as a `DISTINCT` on a `UNIQUE` index just returns the indexed values — Panagiotis Kanavos, Jul 15 '20 at 15:39
I've updated the question using `explain (analyze, buffers)` — The Hoff, Jul 15 '20 at 15:58

Erwin Brandstetter · Accepted Answer · 2022-04-05T18:59:52.957

Logical difference

Both columns id and email are UNIQUE. But only id is NOT NULL. (PRIMARY KEY columns always are.) NULL values are not considered equal, multiple NULL values are allowed in a column with UNIQUE constraint (or index). That's according to standard SQL. See:

Allow null in unique column

But DISTINCT or DISTINCT ON consider NULL values equal. The manual:

Obviously, two rows are considered distinct if they differ in at least one column value. Null values are considered equal in this comparison.

Bold emphasis mine. Further reading:

Select first row in each GROUP BY group?

In your second query, distinct on (id) is a logical no-op: the result is guaranteed to be the same as without DISTINCT ON. And since the outer SELECT filters on id = 123, Postgres can strip the noise and work off a very cheap index-only scan.

In your first query, on the other hand, distinct on (email) might actually do something if there are more than one rows with email IS NULL. Then Postgres has to pick the first id according to the given sort order. Since there is no ORDER BY, it results in an arbitrary pick. But the outer SELECT with the predicate where id = 123 may depend on the result. The whole query is different from the first on principle - and broken by design.

Serendipity

All that aside, two "lucky" findings stick out:

Sort Method: external merge  Disk: 2304kB

Mention of "Disk" indicates insufficient work_mem. See:

Configuration parameter work_mem in PostgreSQL on Linux

          ->  Seq Scan on users  (cost=0.00..3494.80 rows=67380

In my tests, I always got an index scan here. Indicates a bloated index or some other problem with your setup.

Useful comparison?

The comparison is going nowhere. We might learn something from comparing the first query with this query - after switching roles of PK and UNIQUE column:

select email
from  (select distinct on (id) email from users) t
where email = 'user123@foo.com';

Or from comparing the second query with this one - trying the same with the UNIQUE column instead of the PK column:

select email
from  (select distinct on (email) email from users) t
where email = 'user123@foo.com';

We learn that PK and UNIQUE constraint show no different effect on query plans. Postgres does not use the meta-information to cut corners. The PK would actually make a difference with GROUP BY. See:

Unexpected behaviour for GROUP BY query

So this works:

SELECT email
FROM  (
   SELECT email -- no aggregate required, because id = PK
   FROM   users
   GROUP  BY id  -- !
   ) t
WHERE email = 'user123@foo.com';

But the same does not work after switching id and email. I added some demos to the fiddle:

db<>fiddle here

So?

Both queries are nonsense for different reasons. I don't see how they can help with your real query:

The real problem I'm having is that I'm trying to create a view that does distinct on an indexed column that isn't unique and the performance is very bad.

We'd need to see your real query - and all other relevant details of your setup. There may be solutions, but that may well beyond the scope of a question on SO. Consider hiring a consultant. Or consider one of these methods to optimize performance:

Optimize GROUP BY query to retrieve latest row per user

Why does distinct on in subqueries hurt performance in PostgreSQL?

1 Answers1

Logical difference

Serendipity

Useful comparison?

So?