Sum duplicate rows of a pandas data frame

Question

I have a data frame like this:

df
col1    col2     col3
 A       X         6
 B       Y         9
 C       Z         10
 B       Y         11
 F       P         7
 G       H         8
 D       Y         4
 G       H         4

Now I want to add up the col3 values of rows whose col1 and col2 values are duplicated. For example:

B-Y-9 and B-Y-11 are duplicates, so these two rows should become a single row, B-Y-20.

So the final data frame should look like this:

col1    col2     col3
 A       X         6
 C       Z         10
 B       Y         20
 F       P         7
 D       Y         4
 G       H         12

I can do it with a for loop that compares each row with the previous rows, but that will be slow. I'm looking for a pandas shortcut or a more Pythonic way to do it efficiently.

Do you need `df.groupby(['col1','col2'], as_index=False)['col3'].sum()` ? — jezrael, Apr 06 '20 at 07:57

score 1 · Answer 1 · answered Apr 06 '20 at 07:58

df.groupby(['col1', 'col2']).sum().reset_index()
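
For reference, here is a minimal runnable sketch of the one-liner above applied to the sample data from the question. The frame construction and the variable names `result` and `result_unsorted` are my own assumptions for illustration. The second variant follows the `as_index=False` suggestion from the comment and adds `sort=False`, which keeps groups in order of first appearance instead of sorting by key.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["A", "B", "C", "B", "F", "G", "D", "G"],
    "col2": ["X", "Y", "Z", "Y", "P", "H", "Y", "H"],
    "col3": [6, 9, 10, 11, 7, 8, 4, 4],
})

# Group on the two key columns and sum col3 within each group.
# groupby sorts the result by the group keys by default.
result = df.groupby(["col1", "col2"]).sum().reset_index()

# Variant: as_index=False keeps the keys as ordinary columns, and
# sort=False keeps the groups in order of first appearance.
result_unsorted = df.groupby(
    ["col1", "col2"], as_index=False, sort=False
)["col3"].sum()

print(result)
print(result_unsorted)
```

Both variants give B-Y a total of 20 and G-H a total of 12. Neither reproduces the exact row order shown in the question's expected output, so if order matters, sort the result explicitly afterwards.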

Tom Ron
