I'm trying to sum the values in several columns whenever an ID appears multiple times within the dataframe. Here is what I came up with, but I doubt this is an efficient way to do it. Any suggestions?

import pandas as pd

df = pd.DataFrame({"ID":["Hash", "Random","Ashe", "Hash"], "RandomData":[12, 2, 32, 1], "SecondRandom":[1, 3, 4, 45]})

ids = []
data_first = []
data_sec = []
for i in list(df["ID"].unique()):
    tmp = df[df["ID"] == i]
    ids.append(i)
    data_first.append(tmp["RandomData"].sum())
    data_sec.append(tmp["SecondRandom"].sum())
    
new_df = pd.DataFrame({"ID":ids, "RandomData":data_first, "SecondRandom":data_sec})
Achille G

1 Answer

Please try grouping by ID and summing each group:

df.groupby('ID').agg('sum')
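
Note that by default this puts ID in the index and sorts the groups alphabetically. If you want the same layout as the loop in the question (ID as a regular column, groups in order of first appearance), something like the following should do it. This is a minimal sketch using the sample data from the question; as_index and sort are standard groupby parameters.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": ["Hash", "Random", "Ashe", "Hash"],
    "RandomData": [12, 2, 32, 1],
    "SecondRandom": [1, 3, 4, 45],
})

# as_index=False keeps ID as an ordinary column instead of the index;
# sort=False keeps the groups in order of first appearance.
new_df = df.groupby("ID", as_index=False, sort=False).sum()
print(new_df)
```

This avoids filtering the dataframe once per unique ID, which is what makes the loop slow on larger data.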
wwnde