2

I have the following dataframe -

id   count   category   class
1      1       A        X 
1      10      B        X
1      2       B        X
2      3       A        Y
3      5       A        Z
3      6       A        Z 

I want to sum all the occurences of 'value' based on the 'category' that it belongs to, for each ID. And then get all the categories as separate columns. I also want to leave the fourth column as is. So, I want to get something like this-

id    A   B    class
1     1   12   X 
2     3   0    Y
3     11  0    Z

I've tried messing around with trying to get dummy variables and using groupby but I can't figure it out. Any suggestions?

eyllanesc
  • 235,170
  • 19
  • 170
  • 241
mDe
  • 97
  • 1
  • 10
  • 3
    `df.pivot_table(index='id', columns='category', values='count', aggfunc='sum', fill_value=0)`. This is a dupe. – cs95 Dec 18 '17 at 18:27
  • 2
    `df.groupby(['id','category'])['count'].sum().unstack(fill_value=0)` – BENY Dec 18 '17 at 18:27
  • I'd recommend reading this: http://pandas.pydata.org/pandas-docs/stable/groupby.html – jjj Dec 18 '17 at 18:31
  • modified the question slightly-- not sure if this still warrants a different answer. I'm pretty lost, sorry! – mDe Dec 18 '17 at 18:48
  • 1
    Nah, it doesn't. All you need is `df.groupby('id').class.first()` and concatenate this column with the result from either Wen's answer or mine. – cs95 Dec 18 '17 at 18:58

0 Answers0