Optimize setup for read performance
Your memory settings seem reasonable for a 64GB server - except maybe work_mem = 512MB. That's high. Your queries are not particularly complex and your tables are not that big.
4.5 million rows (300k x 15) in the simple junction table documents_tags should occupy ~ 156 MB and the PK another 96 MB. For your query you typically don't need to read the whole table, just small parts of the index. For a "mostly read-only" workload like yours, you should see index-only scans on the index of the PK exclusively. You don't need nearly as much work_mem - which may not matter much - except where you have many concurrent queries. Quoting the manual:
... several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value.
Setting work_mem too high may actually impair performance.
I suggest reducing work_mem to 128 MB or less to avoid possible memory starvation - unless you have other common queries that require more. You can always set it higher locally for special queries.
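For example, a transaction-local override for an exceptionally memory-hungry query might look like this - the query and the 1GB value are just illustrations, not recommendations for your workload:
BEGIN;
SET LOCAL work_mem = '1GB';  -- reverts automatically at COMMIT / ROLLBACK
-- hypothetical heavy aggregate that benefits from more memory for hashing
SELECT dt.tag_id, count(*) AS documents
FROM documents_tags dt
GROUP BY dt.tag_id
ORDER BY documents DESC;
COMMIT;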
There are several other angles to optimize read performance.
Key problem: leading index column
All of this may help a little. But the key problem is this:
PRIMARY KEY (document_id, tag_id)
300k documents, 2 tags to exclude. Ideally, you have an index with tag_id as leading column and document_id as 2nd column. With an index on just (tag_id) you can't get index-only scans. If this query is your only use case, change your PK as demonstrated below.
Or probably even better: create an additional plain index on (tag_id, document_id) if you need both - and drop the two other indexes on documents_tags on just (tag_id) and (document_id). They offer nothing over the two multicolumn indexes. The remaining 2 indexes (as opposed to 3 indexes before) are smaller and superior in every way. A sketch of that alternative follows after the maintenance commands below.
While you are at it, I suggest also running CLUSTER on the table using the new PK, all in one transaction, possibly with some extra maintenance_work_mem set locally:
BEGIN;
SET LOCAL maintenance_work_mem = '256MB';
ALTER TABLE documents_tags
DROP CONSTRAINT documents_tags_pkey
, ADD PRIMARY KEY (tag_id, document_id); -- tag_id first.
CLUSTER documents_tags USING documents_tags_pkey;
COMMIT;
Don't forget to:
VACUUM ANALYZE documents_tags;
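If you go the alternative route instead - keep the current PK and add a plain multicolumn index - a sketch might look like the following. The index names are assumptions; replace them with the actual names of your existing single-column indexes (check with \d documents_tags in psql):
CREATE INDEX documents_tags_tag_document_idx ON documents_tags (tag_id, document_id);
-- The single-column indexes offer nothing over the two multicolumn indexes.
-- Index names below are assumed, use your actual names:
DROP INDEX documents_tags_tag_id_idx;
DROP INDEX documents_tags_document_id_idx;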
Queries
The query itself is run-of-the-mill. Here are the 4 standard techniques:
NOT IN is - to quote myself:
Only good for small sets without NULL values
Your use case exactly: all involved columns NOT NULL and your list of excluded items is very short. Your original query is a hot contender.
NOT EXISTS and LEFT JOIN / IS NULL are always hot contenders. Both have been suggested in other answers. LEFT JOIN has to be an actual LEFT [OUTER] JOIN, though.
EXCEPT ALL would be shortest, but often not as fast.
1. NOT IN
SELECT document_id
FROM documents d
WHERE document_id NOT IN (
    SELECT document_id -- no need for column alias, only value is relevant
    FROM documents_tags
    WHERE tag_id IN (12345, 7654)
);
2. NOT EXISTS
SELECT document_id
FROM documents d
WHERE NOT EXISTS (
    SELECT FROM documents_tags
    WHERE document_id = d.document_id
    AND tag_id IN (12345, 7654)
);
3. LEFT JOIN / IS NULL
SELECT d.document_id
FROM documents d
LEFT JOIN documents_tags dt ON dt.document_id = d.document_id
    AND dt.tag_id IN (12345, 7654)
WHERE dt.document_id IS NULL;
4. EXCEPT ALL
SELECT document_id
FROM documents
EXCEPT ALL -- ALL keeps duplicate rows and makes it faster
SELECT document_id
FROM documents_tags
WHERE tag_id IN (12345, 7654);
Benchmark
To put my theories to the test.
Test setup
SET random_page_cost = 1.1;
SET work_mem = '128MB';
CREATE TABLE documents (
  document_id serial PRIMARY KEY
, title text
);
-- CREATE TABLE tags ( ... -- irrelevant for this query
CREATE TABLE documents_tags (
  document_id int REFERENCES documents
, tag_id int -- REFERENCES tags -- irrelevant for test
  -- no PK yet, to test seq scan
  -- it's also faster to create the PK after filling the big table
);
INSERT INTO documents (title)
SELECT 'some dummy title ' || g
FROM generate_series(1, 300000) g;
INSERT INTO documents_tags(document_id, tag_id)
SELECT i.*
FROM documents d
CROSS JOIN LATERAL (
    SELECT DISTINCT d.document_id, ceil(random() * 50000)::int
    FROM generate_series(1, 30)
    WHERE random() > .5
) i;
ALTER TABLE documents_tags ADD PRIMARY KEY (document_id, tag_id); -- current idx
VACUUM ANALYZE documents_tags;
VACUUM ANALYZE documents;
Note that rows in documents_tags are physically clustered by document_id due to the way I filled the table - which is likely your current situation as well.
Original Test with Postgres 9.5 on 2016-08-06
On my old laptop with 4 GB RAM.
3 test runs with each of the 4 queries, best of 5 every time to exclude caching effects.
Test 1: With documents_tags_pkey like you have it. Index and physical order of rows are bad for our query.
Test 2: Recreate the PK on (tag_id, document_id) as suggested.
Test 3: CLUSTER on the new PK.
Execution time of EXPLAIN ANALYZE in ms:
time in ms | Test 1 | Test 2 | Test 3
---------------+--------+--------+-------
1. NOT IN | 654 | 70 | 71 -- winner!
2. NOT EXISTS | 684 | 103 | 97
3. LEFT JOIN | 685 | 98 | 99
4. EXCEPT ALL | 833 | 255 | 250
Conclusions
The key element is the right index with leading tag_id - for queries involving few tag_id and many document_id.
To be precise, it's not important that there are more distinct document_id than tag_id. This could be the other way round as well. Btree indexes basically perform the same with any order of columns. What matters is that the most selective predicate in your query filters on tag_id - and that's faster on the leading index column(s).
The winning query for few tag_id to exclude is your original with NOT IN.
NOT EXISTS and LEFT JOIN / IS NULL result in the same query plan. For more than a couple of dozen IDs, I expect these to scale better ...
In a read-only situation you'll see index-only scans exclusively, so the physical order of rows in the table becomes irrelevant. Hence, test 3 did not bring any more improvements.
If writes to the table happen and autovacuum can't keep up, you'll see (bitmap) index scans. Physical clustering is important for those.
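To verify which plan you actually get in your own setup, inspect the plan like this (the tag IDs are the placeholders from the examples above). Look for "Index Only Scan" nodes and low "Heap Fetches" in the output:
EXPLAIN (ANALYZE, BUFFERS)
SELECT document_id
FROM documents d
WHERE NOT EXISTS (
    SELECT FROM documents_tags
    WHERE document_id = d.document_id
    AND tag_id IN (12345, 7654)
);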
Retest with Postgres 15 on 2023-04-11
With plenty of work_mem on a newer laptop with 64 GB RAM.
I skipped "Test 2", and added "Test 1+" instead. That's the same as Test 1, but with increased max_parallel_workers_per_gather
so that 3 parallel workers were used instead of just 2.
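To try that yourself, raising the setting for the session is enough. The exact value used in the test is not given above, so the 3 here is an assumption (the default is 2):
SET max_parallel_workers_per_gather = 3;  -- allow up to 3 parallel workers per Gather node
-- re-run the EXPLAIN ANALYZE of the queries above, then restore the default:
RESET max_parallel_workers_per_gather;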
time in ms | Test 1 | Test 1+ | Test 3
---------------+--------+---------+-------
1. NOT IN | 170 | 154 | 42
2. NOT EXISTS | 153 | 132 | 35 -- winner!
3. LEFT JOIN | 154 | 135 | 35 -- same query plan
4. EXCEPT ALL | 276 | 254 | 137
As expected, performance generally improved.
NOT IN could not keep up, since it poses some inherent optimization barriers. The clear winner is now NOT EXISTS or LEFT JOIN / IS NULL, using all index-only scans.
"Test 1" and "Test 1+" look better than they were. Those used 2 and 3 workers, so multiply the numbers by that factor go get actual system load.