I want to join against a huge partitioned table. The planner probably assumes that the partitioned table is very cheap to scan.
I have the following query:
select *
from (
  select * from users where age < 18 limit 10
) as users
join clicks on users.id = clicks.userid
where clicks.ts between '2015-01-01' and now();
The table clicks
is the master table, with roughly 40 child tables that together contain about 40 million records.
This query performs very slowly. Looking at the query plan, Postgres first performs a complete scan of the clicks
table and only then scans the users table.
However, when I limit the users
subquery to 1, the planner scans users first and then clicks.
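For reference, this is roughly how I compare the two plans (a sketch; only the limit value changes between the runs):

```sql
-- Sketch: run once with "limit 10" and once with "limit 1" and compare the join order.
explain analyze
select *
from (
  select * from users where age < 18 limit 10  -- change to "limit 1" for the second run
) as users
join clicks on users.id = clicks.userid
where clicks.ts between '2015-01-01' and now();
```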
It seems as if the planner assumes that the clicks
table is very lightweight. Looking at the stats in pg_class,
the master table clicks
has 0 tuples. That is true in the sense that the master table itself holds no rows, but for planning purposes the planner should account for the sum of all its child tables.
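This is the kind of query I used to look at the stats (reltuples is the planner's stored row estimate; assuming the child tables share the clicks prefix, their estimates show up under their own relnames):

```sql
-- reltuples for the master table is 0; the child tables carry the real estimates
select relname, reltuples
from pg_class
where relname like 'clicks%'
order by relname;
```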
How can I force the planner to use the cheapest option first?
Edit: in simplifying the query I did indeed leave out an additional constraint on the date.
The partitioning constraints are on: clicks.ts
and clicks.userid
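The child tables are set up with inheritance and CHECK constraints roughly like this (a sketch; the partition bounds are made-up examples):

```sql
-- Sketch: one child table per ts/userid range; bounds are examples only
create table clicks_2015_01 (
  check (ts >= '2015-01-01' and ts < '2015-02-01'),
  check (userid >= 0 and userid < 1000000)
) inherits (clicks);
```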
I have indexes on users.age
, user.id
, clicks.userid
and clicks.ts
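For completeness, the index definitions look roughly like this (index names are made up; with inheritance partitioning, each child table carries its own pair of indexes):

```sql
create index users_age_idx on users (age);
create index users_id_idx on users (id);
-- repeated for every clicks child table in the real setup
create index clicks_2015_01_userid_idx on clicks_2015_01 (userid);
create index clicks_2015_01_ts_idx on clicks_2015_01 (ts);
```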
Maybe I just have to trust the planner. I am a little unsure because I once had a case where Postgres showed some weird behavior with limits (PostgreSQL query very slow with limit 1).