I have a table with over a billion records. To improve performance, I partitioned it into 30 partitions. The most frequent queries have (id = ...) in their where clause, so I decided to partition the table on the id column.
Basically, the partitions were created in this way:
CREATE TABLE foo_0 (CHECK (id % 30 = 0)) INHERITS (foo);
CREATE TABLE foo_1 (CHECK (id % 30 = 1)) INHERITS (foo);
CREATE TABLE foo_2 (CHECK (id % 30 = 2)) INHERITS (foo);
CREATE TABLE foo_3 (CHECK (id % 30 = 3)) INHERITS (foo);
.
.
.
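The elided statements follow the same pattern up to foo_29. As a rough sketch only (not necessarily how the tables were actually created), the whole set could be generated in a loop. This assumes PostgreSQL 9.0 or later for the DO statement and that none of the foo_N tables exist yet:

```sql
-- Sketch: create the 30 modulo partitions foo_0 .. foo_29 in one go.
-- Assumes the parent table foo already exists and has an id column.
DO $$
BEGIN
  FOR i IN 0..29 LOOP
    -- Each child inherits from foo and accepts only ids with remainder i.
    EXECUTE 'CREATE TABLE foo_' || i
         || ' (CHECK (id % 30 = ' || i || ')) INHERITS (foo)';
  END LOOP;
END
$$;
```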
I ran ANALYZE for the entire database and, in particular, I made it collect extra statistics for this table's id column by running:
ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000;
However, when I run queries that filter on the id column, the planner shows that it's still scanning all the partitions. constraint_exclusion is set to partition, so that's not the problem.
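For anyone reproducing this, the setting can be checked and set in the current session as follows. This is a minimal sketch using standard commands; the value partition is available from PostgreSQL 8.4 on:

```sql
-- Show the value in effect for this session.
SHOW constraint_exclusion;

-- Set it explicitly for this session if it is not already 'partition'.
SET constraint_exclusion = partition;
```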
EXPLAIN ANALYZE SELECT * FROM foo WHERE (id = 2);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.544..215.540 rows=171477 loops=1)
-> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.539..106.446 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id = 2)
-> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.020..0.020 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.018..0.018 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.012..0.012 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.010..0.010 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=30.504..77.033 rows=171477 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=29.825..29.825 rows=171477 loops=1)
Index Cond: (id = 2)
.
.
.
What could I do to make the planner choose a better plan? Do I need to run ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000; for all the partitions as well?
EDIT
After using Erwin's suggested change to the query, the planner only scans the correct partition; however, the execution time is actually worse than a full scan (at least of the index).
EXPLAIN ANALYZE select * from foo where (id = 2);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.611..224.934 rows=171477 loops=1)
-> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.606..116.565 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id = 2)
-> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.046..0.046 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.044..0.044 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.021..0.021 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.020..0.020 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=32.536..86.730 rows=171477 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=31.842..31.842 rows=171477 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_3 foo (cost=3475.87..285574.05 rows=129032 width=52) (actual time=0.035..0.035 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_3_idx_1 (cost=0.00..3443.61 rows=129032 width=0) (actual time=0.031..0.031 rows=0 loops=1)
.
.
.
-> Bitmap Heap Scan on foo_29 foo (cost=3401.84..276569.90 rows=126245 width=56) (actual time=0.019..0.019 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_29_idx_1 (cost=0.00..3370.28 rows=126245 width=0) (actual time=0.018..0.018 rows=0 loops=1)
Index Cond: (id = 2)
Total runtime: 238.790 ms
Versus:
EXPLAIN ANALYZE select * from foo where (id % 30 = 2) and (id = 2);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..273120.30 rows=611 width=56) (actual time=31.519..257.051 rows=171477 loops=1)
-> Append (cost=0.00..273120.30 rows=611 width=56) (actual time=31.516..153.356 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((id = 2) AND ((id % 30) = 2))
-> Bitmap Heap Scan on foo_2 foo (cost=3249.97..273120.30 rows=610 width=56) (actual time=31.512..124.177 rows=171477 loops=1)
Recheck Cond: (id = 2)
Filter: ((id % 30) = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=30.816..30.816 rows=171477 loops=1)
Index Cond: (id = 2)
Total runtime: 270.384 ms