2

Extended Statistics help the planner with evaluating the query complexity in order to choose the correct algorithm for processing the query. Creating new statistics on a couple of dependant columns in a table probably creates some complex structure somewhere in the database because they have to be refreshed by ANALYZ(E)ing the table. There's a table pg_statistics_ext_data which I can't even view because my master (and only) account does not have the rights to SELECT.

What am I asking is, are they any trade offs to these statistics? Do they take up huge amounts of data on the disk? Can I somehow find out how much?

Why are they not used everytime no matter if I create them manually?

toddddos
  • 321
  • 2
  • 11
  • 1
    Just like the normal statistics, they are relatively lightweight (they *must* be). Per row there should not exist more than one page of data. If you are really interested, you could check the documentation and sourcecode, or experiment. https://www.postgresql.org/docs/11/planner-stats.html#PLANNER-STATS-EXTENDED – wildplasser May 04 '20 at 20:29

1 Answers1

1

The expense of maintaining and using these statistics grows with the number of columns. Imagine having to track the interdependencies of each pair of columns in a wide table!

The price for extended statistics has to be paid whenever PostgreSQL collects statistics and whenever a query is planned. That's why they should only be employed qhen necessary.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • got it, thank you. So if I create the statistics for just one pair of dependant columns - all there is is just one row in "pg_statistic_ext" and that's it? The problem comes when I have a lot of extended statistics? – toddddos May 05 '20 at 13:42
  • It is not a problem, but there is some overhead. The overhead may become a problem if you have a lot of extended statistics, particularly if `default_statistics_target` is high. – Laurenz Albe May 05 '20 at 13:46