I have 30 numeric columns in a table. I want to find the mean, standard deviation, and percentiles for all the columns in the table. I don't want to write out all the column names manually, like below:

select date,
       avg(col1), stddev(col1),
       avg(col2), stddev(col2)
from table_name
group by date;

Is there any way to find the mean, standard deviation, and percentiles for all the columns at once?

Kaushik Nayak
user8545255
  • This answers most of your question: https://stackoverflow.com/questions/14316562/nth-percentile-calculations-in-postgresql ... not sure about STDEV. – Tim Biegeleisen May 21 '18 at 01:59
  • Others you will find here, under aggregate functions: https://www.postgresql.org/docs/9.1/static/functions-aggregate.html – Mankind_008 May 21 '18 at 04:03
  • you have to write the column names and the aggregate functions explicitly. no way around that in sql. – Haleemur Ali May 21 '18 at 04:27
  • apache madlib supports postgresql. summary module should cover all statistics you want. http://madlib.apache.org/docs/latest/group__grp__summary.html – Sung Yu-wei May 21 '18 at 13:25
  • Greenplum or Postgres? They are very different (even though they share some common roots) –  May 21 '18 at 16:31
  • apache madlib supports both postgresql and greenplum – Sung Yu-wei May 21 '18 at 16:41

2 Answers

You can simplify the logic using a lateral join:

select which, min(val), max(val), stddev(val), avg(val)
from t, lateral
     (values ('col1', col1), ('col2', col2), . . . 
     ) v(which, val)
group by which;

You still have to list the columns, but you only need to do so once in the values clause.
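
To also cover the percentiles asked about in the question, the same pattern can probably be extended with ordered-set aggregates. This is a minimal sketch, not a tested Greenplum recipe: the table name t and the columns col1 to col3 are placeholders, and it assumes a server that supports both LATERAL and percentile_cont (PostgreSQL 9.4 or later; availability on Greenplum depends on the version).

```sql
-- Sketch under assumptions: t, col1, col2, col3 are hypothetical names.
-- Each column is cast to double precision so that all rows of the
-- VALUES list share one type and percentile_cont accepts the input.
select t.date,
       v.which,
       avg(v.val)    as mean,
       stddev(v.val) as std,
       percentile_cont(0.25) within group (order by v.val) as p25,
       percentile_cont(0.50) within group (order by v.val) as median,
       percentile_cont(0.75) within group (order by v.val) as p75
from t
cross join lateral
     (values ('col1', t.col1::double precision),
             ('col2', t.col2::double precision),
             ('col3', t.col3::double precision)
     ) v(which, val)
group by t.date, v.which
order by t.date, v.which;
```

The result has one row per date and column name rather than one wide row per date, which is the trade-off for listing each column only once.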

Gordon Linoff

Dynamic SQL is a little bit tricky in Greenplum.

Here is an example based on the instructions at https://www.pivotalguru.com/?p=266

$ psql postgres -c "create table foo (date date, c1 int, c2 int, c3 int);"
$ cat <<EOT >> /tmp/bar.sql
> select 'select ';
> select ' avg('  || attname || '), stddev(' || attname || '),' from pg_attribute
> where attrelid = 'foo'::regclass::oid and attnum > 0 and attname != 'date';
> select ' date from foo group by date;';
> EOT
$ psql -A -t -f /tmp/bar.sql postgres | psql -a postgres
select 
 avg(c1), stddev(c1),
 avg(c2), stddev(c2),
 avg(c3), stddev(c3),
 date from foo group by date;
 avg | stddev | avg | stddev | avg | stddev | date 
-----+--------+-----+--------+-----+--------+------
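
The same statement can also be generated by a single catalog query instead of three separate selects. The following is a sketch, assuming the table foo created above and a server that provides format() and string_agg() with ORDER BY (PostgreSQL 9.1 or later; older Greenplum releases may lack format()). The attisdropped filter is an addition that skips columns which have been dropped from the table.

```sql
-- Sketch: builds one "select avg(...), stddev(...), ... group by date"
-- statement as text for every column of foo except date.
-- %1$I reuses the first argument and quotes it as an identifier.
select 'select '
       || string_agg(format('avg(%1$I), stddev(%1$I)', attname),
                     ', ' order by attnum)
       || ', date from foo group by date;'
from pg_attribute
where attrelid = 'foo'::regclass
  and attnum > 0
  and not attisdropped
  and attname <> 'date';
```

The generated text can be piped into a second psql as shown above. With a psql client of version 9.6 or later, ending the query with \gexec instead of the semicolon should run the generated statement directly.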
Xin Zhang