2

I have the following table :

user      category    number
1         A           8
1         B           6
2         A           1
2         C           9
3         B           5

I want to "unfold" or "dummify" the category column and fill them with the "number" column to obtain:

user      cat_A    cat_B    cat_C
1         8        6        0
2         1        0        9
3         0        5        0

Is it possible to achieve this in SQL (Impala) ?

I found this question How to create dummy variable columns for thousands of categories in Google BigQuery?

However it seems a little bit complex and I'd rather do it in Pandas.

Is there a simpler solution, knowing that I have 10 categories (A, B, C, D etc)?

Vincent
  • 1,534
  • 3
  • 20
  • 42

1 Answers1

3

You can try to use condition aggregate function.

SELECT user,
        SUM(CASE WHEN category = 'A' THEN number ELSE 0 END) cat_A,  
        SUM(CASE WHEN category = 'B' THEN number ELSE 0 END) cat_B,  
        SUM(CASE WHEN category = 'C' THEN number ELSE 0 END) cat_C     
FROM T 
GROUP BY user      
D-Shih
  • 44,943
  • 6
  • 31
  • 51