1

I have a table with a bunch of boolean columns. I'd like to rank these columns by the count of true values each one has.

I found a way to count the number of true values in a column using:

SELECT count(CASE WHEN col1 THEN 1 ELSE null END) as col1,
       count(CASE WHEN col2 THEN 1 ELSE null END) as col2
       ....
FROM my_table;

but this approach has two problems:

  1. I have to manually type the names of the columns
  2. I have to then transpose the result and order by value

Is there a way to do the whole operation one query?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Fra
  • 4,918
  • 7
  • 33
  • 50

2 Answers2

1

This is not actually a crosstab job (or "pivot" in other RDBMS), but the reverse operation, "unpivot" if you will. One elegant technique is a VALUES expression in a LATERAL join.

The basic query can look like this, which takes care of:

  1. I have to then transpose the result and order by value
SELECT c.col, c.ct
FROM  (
   SELECT count(col1 OR NULL) AS col1
        , count(col2 OR NULL) AS col2
          -- etc.
   FROM   tbl
   ) t
, LATERAL (
   VALUES
     ('col1', col1)
   , ('col2', col2)
   -- etc.
   ) c(col, ct)
ORDER  BY 2;

That was the simple part. Your other request is harder:

  1. I have to manually type the names of the columns

This function takes your table name and retrieves meta data from the system catalog pg_attribute. It's a dynamic implementation of the above query, safe against SQL injection:


CREATE OR REPLACE FUNCTION f_true_ct(_tbl regclass)
  RETURNS TABLE (col text, ct bigint)
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE (
   SELECT format('
      SELECT c.col, c.ct
      FROM  (SELECT %s FROM tbl) t
           , LATERAL (VALUES %s) c(col, ct)
      ORDER  BY 2 DESC'
    , string_agg (format('count(%1$I OR NULL) AS %1$I', attname), ', ')
    , string_agg (format('(%1$L, %1$I)', attname), ', ')
      )
   FROM   pg_attribute
   WHERE  attrelid = _tbl             -- valid, visible, legal table name 
   AND    attnum >= 1                 -- exclude tableoid & friends
   AND    NOT attisdropped            -- exclude dropped columns
   AND    atttypid = 'bool'::regtype  -- only character types
   );
END
$func$;

Call:

SELECT * FROM f_true_ct('tbl');  -- table name optionally schema-qualified

Result:

 col  | ct
------+---
 col1 | 3
 col3 | 2
 col2 | 1

Works for any table to rank all boolean columns by their count of true values.

To understand the function parameter, read this:

Related answers with more explanation:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

If I understand correctly, you can do this with a giant union all:

select c.*
from ((select 'col1' as which, sum(case when col1 then 1 else 0 end) as cnt from t
      ) union all
      (select 'col2' as which, sum(case when col2 then 1 else 0 end) as cnt from t
      ) union all
      . . .
     ) c
order by cnt desc;

Although you still need to type the results, this does sidestep the transpositions.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786