Source of the error msg
One of the columns does not have data type you think it has. Must be operative_margin
, probably text
?
The 1-parameter form of crosstab()
only uses the "category" column (year
in your example) only for sorting. And the "row_name" column (dist_chann_name
- or dist_chann_id
?) would produce a different error msg.
Solution
Either way, unless you can guarantee that every "row_name" has exactly two values to it, it's safer to use the 2-parameter form of corosstab()
:
SELECT *
FROM crosstab(
$$
SELECT dist_chann_name, year, operative_margin
FROM marginality_by_channel
ORDER BY 1, 2
$$
, 'VALUES (2020), (2021)'
) AS ct ("DC" int, "2020" int, "2021" int);
db<>fiddle here
This variant also happens to be more tolerant with type mismatches (as everything is passed as text
anyway). See:
crosstab()
shines for many resulting value columns (faster, shorter). For just two "value" columns, aggregate FILTER
might be the better (simpler) choice. Not much performance to gain (if any, after adding some overhead). See:
Broken setup
That aside, your setup is ambiguous to begin with. It includes two rows for the same (dist_chann_name, year) = (3, 2021)
.
- a_horse uses
sum()
in his aggregate FILTER
solution. You might also use min()
or max()
, or whatever ...
- My solution with the 2-parameter form outputs the last value according to sort order. (Think of it as each next value overwriting it's dedicated spot.)
- The 1-parameter form outputs the first value according to sort order. (Think of it as "first come, first serve". Superfluous rows are discarded.)
A clean solution would use an explicit sort order and document the effect, or work with a query producing distinct values, or use the appropriate aggregate function with the FILTER
solution.