Python Summing up Rows in Dataframe with the same Key

Question

I want to sum up rows in a dataframe which have the same row key.

The purpose is to shrink the data set.

For example, suppose the data frame looks like this:

Fruit       Count
Apple          10
Pear           20
Apple           5
Banana          7
Banana         12
Pear            8
Apple          10

I want the final dataframe to look like this:

Fruit       Count
Apple          25
Pear           28
Banana         19

I am using Python's pandas, numpy, matplotlib and other data analysis packages. Is there a way to do this in Python using functions from these packages?

Here is the code to create the example dataframe.

import pandas as pd
df = pd.DataFrame([["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7], ["Banana", 12], ["Pear", 8], ["Apple", 10]], columns=["Fruit", "Count"])

Look into pandas `groupby()` function. It does precisely this. — AirSquid, Feb 05 '19 at 03:06

score 4 · Accepted Answer · answered Feb 05 '19 at 03:05

How about groupby with sum()? For example, df.groupby(['Fruit'])['Count'].sum():

import pandas as pd
df = pd.DataFrame([["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7], ["Banana", 12], ["Pear", 8], ["Apple", 10]], columns=["Fruit", "Count"])
df = df.groupby(['Fruit'])['Count'].sum()
print(df)

Output:

Fruit
Apple     25
Banana    19
Pear      28
Name: Count, dtype: int64
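Note that the result above is a Series indexed by Fruit rather than a two-column DataFrame like the one in the question. One possible way to get the DataFrame shape back is reset_index(). This is a minimal sketch; the names `totals` and `result` are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7],
     ["Banana", 12], ["Pear", 8], ["Apple", 10]],
    columns=["Fruit", "Count"],
)

# Selecting one column and summing gives a Series whose index is the Fruit key.
totals = df.groupby("Fruit")["Count"].sum()

# reset_index() moves the key back into an ordinary column, so the result
# is a DataFrame with "Fruit" and "Count" columns.
result = totals.reset_index()
print(result)
```

Keeping the Series is often fine for lookups such as totals["Apple"]; the DataFrame form is handier when the result is merged or written back out.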

score 4 · Answer 2 · answered Feb 05 '19 at 03:05

Use groupby with as_index=False, and sum:

>>> df.groupby('Fruit',as_index=False)['Count'].sum()
    Fruit  Count
0   Apple     25
1  Banana     19
2    Pear     28
>>>
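The output above is sorted by key (Apple, Banana, Pear), while the question lists the groups in order of first appearance (Apple, Pear, Banana). If that order matters, passing sort=False to groupby should preserve it. A small sketch, with `ordered` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7],
     ["Banana", 12], ["Pear", 8], ["Apple", 10]],
    columns=["Fruit", "Count"],
)

# sort=False keeps the groups in the order their keys first appear,
# instead of sorting the keys alphabetically.
ordered = df.groupby("Fruit", as_index=False, sort=False)["Count"].sum()
print(ordered)
```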

U13-Forward

score 3 · Answer 3 · answered Feb 05 '19 at 03:08

Yes! It's as easy as:

df.groupby("Fruit").sum()

lunguini

score 2 · Answer 4 · answered Feb 05 '19 at 03:07

This should be the shortest way to get what you are after:

df.groupby("Fruit").sum()

Outputs:

        Count
Fruit
Apple      25
Banana     19
Pear       28
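One thing to keep in mind with this short form: calling sum() on the whole grouped DataFrame sums every remaining column, not just Count. The sketch below illustrates this with a hypothetical extra column, "Weight", which is not part of the question's data.

```python
import pandas as pd

# "Weight" is a hypothetical column added only for illustration.
df = pd.DataFrame(
    [["Apple", 10, 1.5], ["Pear", 20, 3.0], ["Apple", 5, 0.5], ["Banana", 7, 1.0]],
    columns=["Fruit", "Count", "Weight"],
)

# Sums both "Count" and "Weight" within each Fruit group.
all_totals = df.groupby("Fruit").sum()

# Selecting columns first restricts the aggregation to just those columns.
count_only = df.groupby("Fruit")[["Count"]].sum()

print(all_totals)
print(count_only)
```

For the single-column frame in the question the two forms give the same totals, so the shorter call is fine there.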

cullzie

score 1 · Answer 5 · answered Feb 05 '19 at 03:13

Use groupby with sum:

df = df.groupby('Fruit').sum()
print(df)

Output:

        Count
Fruit
Apple      25
Banana     19
Pear       28
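Since the stated goal is to shrink the data set, a quick sanity check can confirm that the grouped frame has fewer rows while the grand total is unchanged. This is only a sketch; `summed` is an illustrative name, used here so the original df is not overwritten.

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7],
     ["Banana", 12], ["Pear", 8], ["Apple", 10]],
    columns=["Fruit", "Count"],
)

summed = df.groupby("Fruit").sum()

# One row per distinct key instead of one row per original record.
print(len(df), "rows reduced to", len(summed))

# Grouping only redistributes the counts, so the overall total must match.
assert summed["Count"].sum() == df["Count"].sum()
```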

xashru
