How do I sum the values of a column based on another column?

Asked Dec 18 '17 at 18:25

Active Dec 18 '17 at 19:03

Viewed 105 times

I have the following dataframe -

id   count   category   class
1      1       A        X 
1      10      B        X
1      2       B        X
2      3       A        Y
3      5       A        Z
3      6       A        Z

I want to sum all the occurences of 'value' based on the 'category' that it belongs to, for each ID. And then get all the categories as separate columns. I also want to leave the fourth column as is. So, I want to get something like this-

id    A   B    class
1     1   12   X 
2     3   0    Y
3     11  0    Z

I've tried messing around with trying to get dummy variables and using groupby but I can't figure it out. Any suggestions?

edited Dec 18 '17 at 19:03

eyllanesc

235,170
19
170
241

asked Dec 18 '17 at 18:25

mDe

3

`df.pivot_table(index='id', columns='category', values='count', aggfunc='sum', fill_value=0)`. This is a dupe. – cs95 Dec 18 '17 at 18:27
2

`df.groupby(['id','category'])['count'].sum().unstack(fill_value=0)` – BENY Dec 18 '17 at 18:27
I'd recommend reading this: http://pandas.pydata.org/pandas-docs/stable/groupby.html – jjj Dec 18 '17 at 18:31
modified the question slightly-- not sure if this still warrants a different answer. I'm pretty lost, sorry! – mDe Dec 18 '17 at 18:48
1

Nah, it doesn't. All you need is `df.groupby('id').class.first()` and concatenate this column with the result from either Wen's answer or mine. – cs95 Dec 18 '17 at 18:58

How do I sum the values of a column based on another column?

0 Answers0