1

TIL about tablefunc and crosstab. At first I wanted to "group data by columns" but that doesn't really mean anything.

My product sales look like this

product_id | units   |  date
-----------------------------------
10         | 1          | 1-1-2018
10         | 2          | 2-2-2018
11         | 3          | 1-1-2018
11         | 10         | 1-2-2018
12         | 1          | 2-1-2018
13         | 10         | 1-1-2018
13         | 10         | 2-2-2018

I would like to produce a table of products with months as columns

product_id | 01-01-2018 | 02-01-2018 | etc.
-----------------------------------
10         | 1          | 2
11         | 13         | 0
12         | 0          | 1
13         | 20         | 0

First I would group by month, then invert and group by product, but I cannot figure out how to do this.

Blair Anderson
  • 19,463
  • 8
  • 77
  • 114

1 Answers1

1

After enabling the tablefunc extension,

SELECT product_id, coalesce("2018-1-1", 0) as "2018-1-1"
    , coalesce("2018-2-1", 0) as "2018-2-1"
FROM crosstab(
   $$SELECT product_id, date_trunc('month', date)::date as month, sum(units) as units
    FROM   test
    GROUP BY product_id, month
    ORDER BY 1$$ 

  , $$VALUES ('2018-1-1'::date), ('2018-2-1')$$
   ) AS ct (product_id int, "2018-1-1" int, "2018-2-1" int);

yields

| product_id | 2018-1-1 | 2018-2-1 |
|------------+----------+----------|
|         10 |        1 |        2 |
|         11 |       13 |        0 |
|         12 |        0 |        1 |
|         13 |       10 |       10 |
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • wow wow wow! that worked amazingly. is there a way to "dynamically generate the header months" instead of explicitly naming them. – Blair Anderson May 03 '19 at 19:07
  • See [Erwin Brandstetter's answer](https://stackoverflow.com/a/15514334/190597) (search for "polymorphic, dynamic function for all"). But even then, you would need to define a type or temporary table with the desired row type. That answer does not seem to show how to do that dynamically (after all, it would not be obvious in the general case what the type of each column should be...) Alternatively, you could use an auxiliary language such as Python to "dynamically" compose the SQL. I could show how to generate it with Python, but I'm not sure if that would be helpful to you. – unutbu May 03 '19 at 19:31
  • do you know why i have to `ORDER BY product_id, units` to get an accurate output? – Blair Anderson May 03 '19 at 19:58
  • Sorry, that was my mistake. Per [the docs](https://www.postgresql.org/docs/current/tablefunc.html), when using the 2-argument form of `crosstab`, "the `source_sql` query should always specify `ORDER BY 1` to ensure that values with the same row_name are brought together. However, ordering of the categories within a group is not important." Therefore, I've changed the answer to simply use `ORDER BY 1`, which is equivalent to `ORDER BY product_id`. – unutbu May 04 '19 at 00:56