My pandas DataFrame looks like this:
A 5
A 7
A 4
B 2
B 7
C 8
How can I sum the values for each group to get this?
A 16
B 9
C 8
You can use groupby:
col1 col2
0 A 5
1 A 7
2 A 4
3 B 2
4 B 7
5 C 8
df.groupby('col1')['col2'].sum()
col1
A 16
B 9
C 8
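For anyone who wants to run this end to end, here is a minimal sketch. The column names col1 and col2 are the ones assumed in this answer (the question does not name its columns), and totals is just an illustrative variable name.

```python
import pandas as pd

# Sample data from the question; the column names col1/col2 are assumed.
df = pd.DataFrame({
    "col1": ["A", "A", "A", "B", "B", "C"],
    "col2": [5, 7, 4, 2, 7, 8],
})

# Group rows by the label column and sum the values within each group.
# The result is a Series whose index is the group label.
totals = df.groupby("col1")["col2"].sum()
print(totals)
```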
If you want to keep both as regular columns, as you mentioned in your comment, you can turn the grouped result (a Series indexed by col1) into a new DataFrame and reset the index. If that is what you meant, you can do this instead:
new = pd.DataFrame({'col2' : df.groupby('col1')['col2'].sum()}).reset_index()
new
col1 col2
0 A 16
1 B 9
2 C 8
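A shorter route to the same frame, as far as I can tell, is to pass as_index=False to groupby so that col1 stays a regular column. This is a sketch under the same assumed column names, not the only way to do it.

```python
import pandas as pd

# Same sample data as above; the column names col1/col2 are assumed.
df = pd.DataFrame({
    "col1": ["A", "A", "A", "B", "B", "C"],
    "col2": [5, 7, 4, 2, 7, 8],
})

# as_index=False keeps the grouping key as a column instead of the index,
# so no separate DataFrame construction or reset_index call is needed.
new = df.groupby("col1", as_index=False)["col2"].sum()
print(new)
```

Calling df.groupby('col1')['col2'].sum().reset_index() should give an equivalent result.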
I think you could use pivot_table for that, with sum as the aggregation function:
In [9]: df
Out[9]:
0 1
0 A 5
1 A 7
2 A 4
3 B 2
4 B 7
5 C 8
In [10]: df.pivot_table(index=0, aggfunc='sum').reset_index()
Out[10]:
0 1
0 A 16
1 B 9
2 C 8
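Here is a self-contained sketch of the pivot_table approach. To keep it easy to read it assumes named columns col1 and col2 rather than the default integer labels shown above, and it passes the aggregation as the string "sum"; result is an illustrative name.

```python
import pandas as pd

# Sample data from the question; the column names col1/col2 are assumed.
df = pd.DataFrame({
    "col1": ["A", "A", "A", "B", "B", "C"],
    "col2": [5, 7, 4, 2, 7, 8],
})

# pivot_table groups on `index` and aggregates `values` with `aggfunc`.
# reset_index() moves the group labels back into a regular column.
result = df.pivot_table(index="col1", values="col2", aggfunc="sum").reset_index()
print(result)
```

For a single grouping key this should give the same totals as groupby; pivot_table tends to be more useful once you also want to spread a second key across the columns.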