
I am currently trying to improve a Postgres query. The following is a minimal setup that reproduces the essence of my problem on PostgreSQL 12.6:

CREATE TABLE data_node (id BIGINT PRIMARY KEY);
CREATE TABLE data_entry (id BIGINT PRIMARY KEY, node_fk BIGINT NOT NULL, FOREIGN KEY (node_fk) REFERENCES data_node (id));
CREATE INDEX node_ix ON data_entry USING BTREE (node_fk);
INSERT INTO data_node (id) SELECT generate_series(1,10);
INSERT INTO data_entry (id, node_fk) SELECT s, 2 FROM generate_series(1,10000000) s;

I am trying to efficiently filter the small data_node table down to those entries that are referenced from the large data_entry table. Only a subset of the nodes is expected to be referenced, and in some installations only a single node might be. The following query seems like the natural way to do the job:

SELECT * FROM data_node
WHERE EXISTS (
 SELECT 1 FROM data_entry WHERE node_fk = data_node.id
);

After a VACUUM ANALYZE on both tables, this results in the following query plan:

 Merge Join  (cost=179055.16..179055.99 rows=1 width=8) (actual time=1895.155..1895.158 rows=1 loops=1)
   Merge Cond: (data_node.id = data_entry.node_fk)
   ->  Index Only Scan using data_node_pkey on data_node  (cost=0.14..8.29 rows=10 width=8) (actual time=0.004..0.008 rows=3 loops=1)
         Heap Fetches: 0
   ->  Sort  (cost=179055.02..179055.03 rows=1 width=8) (actual time=1895.143..1895.144 rows=1 loops=1)
         Sort Key: data_entry.node_fk
         Sort Method: quicksort  Memory: 25kB
         ->  HashAggregate  (cost=179055.00..179055.01 rows=1 width=8) (actual time=1895.135..1895.136 rows=1 loops=1)
               Group Key: data_entry.node_fk
               ->  Seq Scan on data_entry  (cost=0.00..154055.00 rows=10000000 width=8) (actual time=0.009..831.883 rows=10000000 loops=1)

In this minimal setup, the query takes seconds (and in the actual, less minimal application scenario, minutes), although, using the index on the larger table, it should finish in under a millisecond. With `SET enable_seqscan = false;`, I was able to get Postgres to use the "good" query plan:

 Nested Loop Semi Join  (cost=0.57..12.86 rows=1 width=8) (actual time=0.029..0.044 rows=1 loops=1)
   ->  Index Only Scan using data_node_pkey on data_node  (cost=0.14..8.29 rows=10 width=8) (actual time=0.003..0.004 rows=10 loops=1)
         Heap Fetches: 0
   ->  Index Only Scan using node_ix on data_entry  (cost=0.43..185308.04 rows=10000000 width=8) (actual time=0.004..0.004 rows=0 loops=10)
         Index Cond: (node_fk = data_node.id)
         Heap Fetches: 0
 Planning Time: 0.269 ms
 Execution Time: 0.068 ms

While this gives me what I want in my test setting, globally changing `enable_seqscan` is, of course, not a feasible solution for a production environment.
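As a stopgap, the setting can at least be scoped to a single transaction with `SET LOCAL`, so the rest of the session keeps the default planner behaviour. A sketch of what I mean (I would not consider this a clean solution either):

```sql
BEGIN;
-- Only affects the current transaction; reverts automatically at COMMIT/ROLLBACK.
SET LOCAL enable_seqscan = off;

SELECT * FROM data_node
WHERE EXISTS (
    SELECT 1 FROM data_entry WHERE node_fk = data_node.id
);

COMMIT;
```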

Is there any "clean" way to make sure that Postgres performs this filtering of referenced small-table entries using the "good" query plan, or a plan of comparable performance?
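For what it's worth, one rewrite I have seen suggested for this kind of problem (I have not yet verified it in my actual application) is to probe the large table once per node with a `LATERAL` subquery capped by `LIMIT 1`; since the planner cannot flatten such a subquery into a merge or hash join, it should be forced into a nested loop over the index:

```sql
-- Each data_node row triggers at most one index probe on node_ix.
SELECT n.*
FROM data_node n
CROSS JOIN LATERAL (
    SELECT 1
    FROM data_entry e
    WHERE e.node_fk = n.id
    LIMIT 1
) AS hit;
```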

aschnab
  • I cannot reproduce that on my 13.2 database with standard configuration. I always get the good plan. – Laurenz Albe May 04 '21 at 01:49
  • @Abelisto My `shared_buffers` was 128MB. – Laurenz Albe May 04 '21 at 08:37
  • @LaurenzAlbe Ok, it was a bit hard to reproduce. https://dbfiddle.uk/?rdbms=postgres_13&fiddle=34e862d40dfb788fe74037f16ad67dc5 Seems like a bug. – Abelisto May 04 '21 at 09:10
  • @LaurenzAlbe @Abelisto For yesterday's tests, I used an existing postgres-12.6 database with `shared_buffers = 15980MB`. The other non-default memory parameters on that system were `work_mem = 100MB` and `maintenance_work_mem = 3196MB`. I just retried my experiments on completely fresh, default (i.e. `shared_buffers = 128MB`) local installations of postgres-12.6 and postgres-13.2 on my Windows machine, too. In both cases, the described experiment immediately produced the bad query plan as well. – aschnab May 04 '21 at 11:10
  • I would report that to the Postgres dev team. The very low cardinality of the index seems to throw all queries out the window: even something like `select distinct node_fk from data_entry where node_fk = 2` does a full table scan... – JP Moresmau May 04 '21 at 16:08
  • @JPMoresmau Thanks, I just sent this to postgres' bug report mailing list. – aschnab May 05 '21 at 13:39
  • @LaurenzAlbe Just curious: any news on this issue? – Abelisto May 22 '21 at 17:39
  • @Abelisto You can read [the thread on -bugs](https://www.postgresql.org/message-id/flat/16993-1298e8dc9c41ab97%40postgresql.org). Tom thinks that it is a bug that is not easy to fix. – Laurenz Albe May 24 '21 at 04:01
  • @LaurenzAlbe Thanks for the link. In my opinion, `EXISTS` should be handled in a very different way. – Abelisto May 24 '21 at 07:51

0 Answers