Summing rows in pandas dataframe which have a common value in one of the columns

Question

I have a pandas dataframe of numeric data, e.g.

	A	B	C
1	a1	b1	1
2	a2	b2	2
3	a3	b3	1
4	a4	b4	2

where a_n and b_n are just any numeric values.

Notice in column C there are 2 entries of 1 and 2 entries of 2. How do I deal with these duplicates such that I sum rows that have duplicate entries in C? i.e. the result is:

	A	B	C
1	a1+a3	b1+b3	1
2	a2+a4	b2+b4	2

mozway · Answer 1 · 2021-08-23T10:55:30.963

0

Simply groupby C and sum:

df.groupby('C', as_index=False).sum()

To keep the original order of the columns:

df.groupby('C', as_index=False).sum()[df.columns]

edited Aug 23 '21 at 10:55

answered Aug 23 '21 at 10:53

mozway

194,879
13
39
75

dupe, unfortunately – jezrael Aug 23 '21 at 10:55
I tried to bring some added value, but I guess OP could have searched better before asking – mozway Aug 23 '21 at 10:57
how do I also retain the original index? This answer works, except for keeping the original index. For example, in my question the index starts at 1 instead of 0. After doing groupby and sum, the index starts at 0. – ogb119 Aug 23 '21 at 12:39
nevermind it doesn't make sense to retain the original index, ignore that! thanks for the answer. – ogb119 Aug 23 '21 at 12:43
In python indexes and most counters start from 0, Assuming you save the output of the groupby+sum in `df_out` you could do `df_out.index += 1` – mozway Aug 23 '21 at 12:44

Summing rows in pandas dataframe which have a common value in one of the columns

1 Answers1