Sum columns in Pandas based on row values

Question

I am trying to set up a script to help order garment blanks. I have a dataset that looks like this:

|  design  | s | m | l | xl | style   | color |
|----------|---|---|---|----|---------|-------|
| design 1 | 5 | 3 | 6 |  1 | style 1 | black |
| design 2 | 4 | 6 | 9 |  5 | style 2 | red   |
| design 3 | 2 | 6 | 5 |  8 | style 1 | red   |
| design 4 | 6 | 8 | 4 |  1 | style 1 | black |
| design 5 | 8 | 2 | 1 |  1 | style 1 | blue  |
| design 6 | 6 | 9 | 5 |  4 | style 2 | red   |

I would like to use Pandas to sum the totals for each style / color pair so I can order the total amount.

Given the data above, I would like the output to be something like:

| style   | color | s  | m  | l  | xl |
|---------|-------|----|----|----|----|
| style 1 | black | 11 | 11 | 10 | 2  | 
| style 1 | red   | 2  | 6  | 5  | 8  |
| style 1 | blue  | 8  | 2  | 1  | 1  |
| style 2 | red   | 10 | 15 | 14 | 9  |

Ok, pretty standard `groupby` and `agg` work. What went wrong with your attempt? — roganjosh, Mar 02 '19 at 20:32
Possible duplicate of [Pandas sum by groupby, but exclude certain columns](https://stackoverflow.com/questions/32751229/pandas-sum-by-groupby-but-exclude-certain-columns) — anky, Mar 02 '19 at 20:33

score 0 · Accepted Answer · answered Mar 02 '19 at 20:44

df[['style', 'color', 's','m','l','xl']].groupby(by=['style','color']).sum()

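You can chain `.sort_values()` onto the result if you want it sorted (the older `.sort()` method is no longer available on DataFrames in current pandas).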
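For reference, here is a minimal self-contained sketch of the approach above, run against the sample data from the question. The names `sizes` and `totals` are only illustrative, and `as_index=False` is an optional choice made here so that `style` and `color` come back as ordinary columns, as in the desired output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "design": [f"design {i}" for i in range(1, 7)],
    "s": [5, 4, 2, 6, 8, 6],
    "m": [3, 6, 6, 8, 2, 9],
    "l": [6, 9, 5, 4, 1, 5],
    "xl": [1, 5, 8, 1, 1, 4],
    "style": ["style 1", "style 2", "style 1", "style 1", "style 1", "style 2"],
    "color": ["black", "red", "red", "black", "blue", "red"],
})

# Size columns to total; "design" is left out because it is not numeric.
sizes = ["s", "m", "l", "xl"]

# One row per (style, color) pair, with the size columns summed.
totals = df.groupby(["style", "color"], as_index=False)[sizes].sum()
print(totals)
```

By default `groupby` sorts the group keys, so the rows come out ordered by style and then color rather than in the order shown in the question.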

answered Mar 02 '19 at 20:44

Ravi


Worked exactly for what I needed. Thank you. – Mitchbart Mar 03 '19 at 14:34

score -1 · Answer 2 · edited Mar 02 '19 at 20:57

df.groupby("style").cumsum

groupby and cumsum will do what you want.

edited Mar 02 '19 at 20:57

Wai Ha Lee


answered Mar 02 '19 at 20:34

Arnold Chung


Why `cumsum` and not `sum`? – roganjosh Mar 02 '19 at 20:35
Oh wait, first, it should be grouped by "style" and "color". Second, I thought it would be summed up, wouldn't it? – Arnold Chung Mar 02 '19 at 20:40
`cumsum` is for a rolling sum over multiple values. If you want the total, you just want `sum` – roganjosh Mar 02 '19 at 20:42
But in his/her case, it is grouped by "style" and "color", so the returned result will be what he/she wants. – Arnold Chung Mar 02 '19 at 20:43
But you haven't grouped by "color", as you said – roganjosh Mar 02 '19 at 20:44
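To make the difference discussed in these comments concrete, here is a small sketch on a made-up three-row frame (the data and the names `running` and `total` are hypothetical, chosen only for illustration). Note also that `cumsum` written without parentheses only references the method and does not call it.

```python
import pandas as pd

# Hypothetical toy data: two rows share the same (style, color) pair.
df = pd.DataFrame({
    "style": ["style 1", "style 1", "style 2"],
    "color": ["black", "black", "red"],
    "s": [5, 6, 4],
})

grouped = df.groupby(["style", "color"])["s"]

# cumsum keeps one value per original row: a running total within each group.
running = grouped.cumsum()

# sum collapses each group to a single total.
total = grouped.sum()

print(running)
print(total)
```

Here `running` has three entries (5, 11, 4), one per input row, while `total` has two (11 and 4), one per style / color pair, which is the shape the question asks for.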
