7

I am new to SQL and was trying to do a crosstab in Postgres. I would have done it in Excel, but I have a database of around 3.5 million rows, 20,000 different values for code, 7 categories in cat, and variable values from 1 to 100. A code may only have few of the 7 categories.

Excel can't handle the number of rows, so SQL it is.

My data is in the form

code   |  cat        |   value |
--------------------------------
abc123 |   1         |    4    |
abc234 |   2         |    6    |
abc345 |   1         |    1    |
abc123 |   3         |    2    |
abc123 |   6         |    12   |

with code and cat as text, value as integer stored in a Postgres table.

I would like to perform a crosstab on code and cat, with sum of value. I would like it to show zero instead of 'null' in the return, but if 'null' would be simpler query, then that would be fine.

So the output I would like is

code   |   'cat=0' | 'cat=1' | 'cat=2' | 'cat=3' | 'cat=4' | 'cat=5' | 'cat=6'|
abc123 |    25     |  0      |  3      |  500    | 250     | 42      |  0     |
abc234 |     0     |  100    |  0      |   10    |  5      |  0      |   25   |
abc345 |    1000   |   0     |  0      |    0    |  0      |  0      |   0    |

I have searched on Postgres help files and other forums; the closest thing was the SO question PostgreSQL Crosstab Query but I couldn't figure out how to sum the values from third column.

Any assistance would be greatly appreciated.

Community
  • 1
  • 1
Siraj
  • 71
  • 1
  • 1
  • 2

2 Answers2

9

I got this working by updating my code to the following:

select * from crosstab(
'select code, cat, sum(value) as value
 from my_table 
 group by code, cat
 order by 1,2'
) as ct(code varchar(255),
    cat_0 bigint,
    cat_1 bigint, 
    cat_2 bigint, 
    cat_3 bigint, 
    cat_4 bigint, 
    cat_5 bigint, 
    cat_6 bigint)

I was able to determine the right data type by running the select statement inside the crosstab and matching my as ct data types to those returned by the query inside the crosstab.

3

Try:

select * from crosstab(
'select code, cat, sum(value) as value
 from my_table 
 group by code, cat
 order by 1,2'
) as ct(code text,
        cat_0 int,
        cat_1 int, 
        cat_2 int, 
        cat_3 int, 
        cat_4 int, 
        cat_5 int, 
        cat_6 int)
  • 3
    thanks.. i tried that, but get this error.... ERROR: invalid return type DETAIL: SQL rowid datatype does not match return rowid datatype. ********** Error ********** ERROR: invalid return type SQL state: 42601 Detail: SQL rowid datatype does not match return rowid datatype. – Siraj Jan 15 '12 at 21:59
  • tried it with count(value), using text rather than int after 'cat_0' etc. still ends with same error: sql state: 42601 – Siraj Jan 15 '12 at 22:51