4

I want to sum up rows in a dataframe which have the same row key.

The goal is to shrink the data set.

For example, suppose the data frame looks like this:

Fruit       Count
Apple          10
Pear           20
Apple           5
Banana          7
Banana         12
Pear            8
Apple          10

I want the final dataframe to look like this.

Fruit       Count
Apple          25
Pear           28
Banana         19

I am using pandas, NumPy, matplotlib and other Python data analysis packages. Is there a way to do this in Python using functions from these packages?

Here is the code to create the example dataframe.

import pandas as pd
df = pd.DataFrame([["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7], ["Banana", 12], ["Pear", 8], ["Apple", 10]], columns=["Fruit", "Count"])
U13-Forward
mrsquid

5 Answers

4

How about groupby with sum()? For example, df.groupby(['Fruit'])['Count'].sum() returns a Series of totals indexed by Fruit:

import pandas as pd

df = pd.DataFrame([["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7], ["Banana", 12], ["Pear", 8], ["Apple", 10]], columns=["Fruit", "Count"])
# Group rows by fruit and sum the counts in each group; the result is a Series.
totals = df.groupby(['Fruit'])['Count'].sum()
print(totals)

Output:

Fruit
Apple     25
Banana    19
Pear      28
Name: Count, dtype: int64
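
Because the result above is a Series rather than a two-column data frame, a short sketch of one way to get back to the shape shown in the question may help. It assumes the same example data; the names totals and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7],
     ["Banana", 12], ["Pear", 8], ["Apple", 10]],
    columns=["Fruit", "Count"],
)

# Selecting a single column before summing yields a Series indexed by Fruit.
totals = df.groupby("Fruit")["Count"].sum()

# reset_index() moves the Fruit index back into an ordinary column,
# giving a DataFrame with the columns Fruit and Count.
result = totals.reset_index()
print(result)
```

Note that the groups come back sorted by key (Apple, Banana, Pear), not in the order they first appear in the input.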
A l w a y s S u n n y
4

Use groupby with as_index=False, and sum:

>>> df.groupby('Fruit', as_index=False)['Count'].sum()
    Fruit  Count
0   Apple     25
1  Banana     19
2    Pear     28
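
The desired output in the question lists the fruits in the order they first appear (Apple, Pear, Banana), whereas groupby sorts the group keys by default. As a minimal sketch, assuming the same example data, passing sort=False along with as_index=False keeps the first-appearance order; the name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 10], ["Pear", 20], ["Apple", 5], ["Banana", 7],
     ["Banana", 12], ["Pear", 8], ["Apple", 10]],
    columns=["Fruit", "Count"],
)

# as_index=False keeps Fruit as a regular column;
# sort=False keeps groups in the order they first appear in df.
result = df.groupby("Fruit", as_index=False, sort=False)["Count"].sum()
print(result)
```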
U13-Forward
3

Yes! It's as easy as:

df.groupby("Fruit").sum()
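
One thing worth knowing about this form: calling sum() on the whole grouped frame sums every remaining column, not just Count. The sketch below illustrates that with a hypothetical extra column, Weight, which is not part of the original question; selecting the columns first restricts the sum.

```python
import pandas as pd

# "Weight" is a made-up column added only to illustrate the behaviour.
df = pd.DataFrame({
    "Fruit": ["Apple", "Pear", "Apple", "Banana", "Banana", "Pear", "Apple"],
    "Count": [10, 20, 5, 7, 12, 8, 10],
    "Weight": [100, 200, 50, 70, 120, 80, 100],
})

# Sums both Count and Weight within each fruit group.
all_sums = df.groupby("Fruit").sum()

# Selecting a list of columns first keeps a DataFrame but sums only Count.
count_only = df.groupby("Fruit")[["Count"]].sum()

print(all_sums)
print(count_only)
```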

lunguini
2

This should be the shortest way to get what you are after:

df.groupby("Fruit").sum()

Output:

        Count
Fruit
Apple      25
Banana     19
Pear       28
cullzie
1

Use groupby with sum:

df = df.groupby('Fruit').sum()
print(df)

Output:

        Count
Fruit
Apple      25
Banana     19
Pear       28
xashru