I've come across a few questions that are somewhat related to mine (e.g. 1, 2). None of them really answer the question though.
Basically my problem is that Postgres refuses to use an index in certain seemingly simple queries. I have managed to boil down my complex query to some examples below:
-- SETUP
CREATE TEMP TABLE test (
id BIGSERIAL PRIMARY KEY,
DATA INT
);
INSERT INTO test(DATA)
SELECT RANDOM() FROM GENERATE_SERIES(1, 100000);
-- just in case the stats are outdated
VACUUM ANALYSE test;
-- QUERIES
-- correct but does a full table scan
EXPLAIN SELECT
id,
DATA
FROM
test
WHERE
(FALSE
OR id IN (
SELECT
UNNEST(ARRAY[100])
));
-- simpler than above but still does a table scan
EXPLAIN SELECT
id,
DATA
FROM
test
WHERE
FALSE
OR id IN (SELECT 100);
Output of EXPLAIN:
QUERY PLAN
----------------------------------------------------------
Seq Scan on test (cost=0.01..1791.01 rows=50000 width=12)
Filter: (hashed SubPlan 1)
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=4)
-- uses PRIMARY KEY index correctly but limited to 1 ID
-- NOTE that it uses "=" instead of "IN"
EXPLAIN SELECT
id,
DATA
FROM
test
WHERE
FALSE
OR id = (SELECT 100 LIMIT 1);
QUERY PLAN
---------------------------------------------------------------------
Index Scan using test_pkey on test (cost=0.30..8.32 rows=1 width=12)
Index Cond: (id = $0)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.01 rows=1 width=4)
-> Result (cost=0.00..0.01 rows=1 width=4)
uses PRIMARY KEY index correctly but I require a LEFT JOIN not an INNER JOIN. -- See next example.
EXPLAIN
WITH ids AS (
SELECT
UNNEST(ARRAY[100]) AS ids )
SELECT
id,
DATA
FROM
test
INNER JOIN ids ON
id = ids.ids;
QUERY PLAN
---------------------------------------------------------------------------
Nested Loop (cost=0.29..8.34 rows=1 width=12)
-> ProjectSet (cost=0.00..0.02 rows=1 width=4)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Index Scan using test_pkey on test (cost=0.29..8.31 rows=1 width=12)
Index Cond: (id = (unnest('{100}'::integer[])))
I need a left join but this does a full table scan
EXPLAIN WITH ids AS (
SELECT 100 AS id
)
SELECT
test.id,
DATA
FROM
test
LEFT JOIN ids ON
test.id = ids.id;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on test (cost=0.00..1541.00 rows=100000 width=12)
So my questions:
- Why does
FALSE OR ... IN ...
prevent the use of an index, whileFALSE OR ... = ...
is fine? - Why does the INNER JOIN use an index but LEFT JOIN does not in my example?
- Is there some way to rewrite the first query without using UNION or duplicating the whole query? (My actual query is much more complex)