
I want to join against a huge partitioned table. The planner probably assumes that the partitioned table is very cheap to scan.

I have the following query:

select *
from (
        select * from users where age < 18 limit 10
    ) as users
    join clicks on users.id = clicks.userid
where
    clicks.ts between '2015-01-01' and now();

The table clicks is the master table with roughly 40 child tables containing together about 40 million records.
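For context, a minimal sketch of what such an inheritance-based layout typically looks like — the column types and partition ranges here are illustrative assumptions, not taken from the question; each child table carries CHECK constraints (here on `ts`) so that constraint exclusion can prune it:

```sql
-- Hypothetical master table (columns assumed for illustration)
CREATE TABLE clicks (
    userid  integer,
    ts      timestamptz
);

-- One child table per range; the question mentions roughly 40 of these
CREATE TABLE clicks_2015_01 (
    CHECK (ts >= '2015-01-01' AND ts < '2015-02-01')
) INHERITS (clicks);
```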

This query performs very slowly. When I look at the plan, Postgres first performs a complete scan of the clicks table and then scans the users table.

However when I limit the users subquery to 1 the planner first scans the users and then the clicks.
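The two plans being compared above can be reproduced side by side with `EXPLAIN (ANALYZE)` — a sketch, using the query from the question:

```sql
EXPLAIN (ANALYZE, BUFFERS)
select *
from (
        select * from users where age < 18 limit 10  -- try "limit 1" to see the plan flip
    ) as users
    join clicks on users.id = clicks.userid
where
    clicks.ts between '2015-01-01' and now();
```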

It seems as if the planner assumes that the clicks table is very lightweight. If I look at the stats in pg_class, the master table clicks has 0 tuples. That is true in one sense, because the master table itself holds no rows, but for planning purposes it should effectively represent the sum of all its child tables.
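The per-child statistics can be inspected directly in the catalogs; the parent row shows `reltuples = 0` while the children carry the real estimates. A sketch (run `ANALYZE clicks;` first if the numbers look stale):

```sql
-- List row-count estimates for the master table and all its children
SELECT c.relname, c.reltuples
FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
WHERE i.inhparent = 'clicks'::regclass;
```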

How can I force the planner to use the cheapest option first?


edit: in simplifying the query I did indeed leave out an additional constraint on the date.

The partitioning constraints are on: clicks.ts and clicks.userid

I have indexes on users.age, users.id, clicks.userid and clicks.ts.

Maybe I just have to trust the planner. I am only a little unsure because I once had a case where Postgres showed some weird behavior with limits (PostgreSQL query very slow with limit 1).

