Sum values for each duplicate ID

Question

I'm trying to sum the values in several columns whenever an ID appears multiple times within the dataframe. Here is what I came up with, but I doubt it is an efficient way to do it. Any suggestions?

import pandas as pd

df = pd.DataFrame({"ID":["Hash", "Random","Ashe", "Hash"], "RandomData":[12, 2, 32, 1], "SecondRandom":[1, 3, 4, 45]})

ids = []
data_first = []
data_sec = []
for i in list(df["ID"].unique()):
    tmp = df[df["ID"] == i]
    ids.append(i)
    data_first.append(tmp["RandomData"].sum())
    data_sec.append(tmp["SecondRandom"].sum())
    
new_df = pd.DataFrame({"ID":ids, "RandomData":data_first, "SecondRandom":data_sec})

Use: `df.groupby('ID', as_index=False).sum()` – jezrael, Feb 17 '22 at 10:57

score 1 · Answer 1 · answered Feb 17 '22 at 10:57

Please try:

df.groupby('ID').agg('sum')
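
For reference, here is a minimal sketch of how the grouped sum could replace the loop, using the sample dataframe from the question. The names `by_index` and `new_df` are only illustrative, and passing `sort=False` is an assumption on my part that the order of first appearance (as produced by looping over `df["ID"].unique()`) should be preserved.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["Hash", "Random", "Ashe", "Hash"],
    "RandomData": [12, 2, 32, 1],
    "SecondRandom": [1, 3, 4, 45],
})

# Grouped sum as suggested in the answer: ID becomes the index of the
# result, and groups are sorted by key by default.
by_index = df.groupby("ID").agg("sum")

# Variant from the comment: as_index=False keeps ID as a regular column.
# sort=False (an assumption, see above) keeps groups in order of first
# appearance, which matches the output of the original loop.
new_df = df.groupby("ID", as_index=False, sort=False).sum()

print(by_index)
print(new_df)
```

Both calls sum every non-key column per ID in one pass, so there is no need to filter the dataframe once per unique ID. If only some columns should be summed, selecting them first, e.g. `df.groupby("ID")[["RandomData", "SecondRandom"]].sum()`, should work as well.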

wwnde