pgstattuple query performance

Question

I am trying to build query that checks dead tuples, live tuples percent, free space etc. in all my users tables using pgstattuple extension. So, I came up with this query:

SELECT schemaname, 
relname, 
(pgstattuple(schemaname ||'.'||relname)).*
FROM pg_statio_user_tables;

pg_statio_user_tables is a pg_catalog view, that uses another system view. So at first glance the query looks quite simple, but it ended up with a total execution time of 27 minutes for 882 objects in db.

The execution plan for this query:

Is there a way to rewrite query still using pgstattuple so it'll perform decent?

score 3 · Accepted Answer · edited May 23 '17 at 11:57

It gets a lot faster if you run the function in a subquery and only decompose the resulting row type on the next level:

SELECT s.schemaname, s.relname, st.*
FROM pg_statio_user_tables s
    ,pgstattuple(s.schemaname ||'.'|| s.relname) AS st;

(That's an implicit LATERAL JOIN. In versions befor Postgres 9.3 use a subquery instead.)

This way, the function is only called once per row in pg_statio_user_tables. Unfortunately you are stumbling over a weak spot in the Postgres query planner (parser). Craig explains in more details in this related answer:

How to avoid multiple function evals with the (func()).* syntax in an SQL query?

And if all you need from the pg_statio_user_tables view are table and schema name, you can have that much faster, yet:

SELECT c.oid::regclass::text AS tbl, (st).*
FROM   pg_class c
JOIN   pg_namespace n ON n.oid = c.relnamespace
      ,pgstattuple(c.oid::regclass::text) st
WHERE  c.relkind = 'r'            -- only tables
AND    n.nspname NOT LIKE 'pg_%'  -- exclude system schemas
ORDER  BY 1;

my query needed 561486.985 ms (~9,4min), yours with lateral join: 155152.465 ms (2,6min), with pg_class 169894.413 ms (2,8min). Still far from speedy Gonzales, but much better :) — Borys, Aug 04 '14 at 13:37

pgstattuple query performance

1 Answers1