Suppose I have the following data frame:
import pandas as pd
df = pd.DataFrame()
df['ID'] = 1, 1, 1, 2, 2, 3, 3
df['a'] = 3, 5, 6, 3, 8, 1, 2
I want to create a for loop that loops over ID and returns the sum of 'a' for that ID. So far I have this:
for i in df['ID']:
    print(i, df.loc[df['ID'] == i, 'a'].sum())
However, this prints the same result once for every row with that ID, like so:
1 14
1 14
1 14
2 11
2 11
3 3
3 3
How do I edit my loop so that once it has returned the value for 'ID' == 1 it moves on to the next ID value rather than just down to the next row?
I'm looking to get the following:
1 14
2 11
3 3
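For reference, here is a rough sketch of what I think I'm after. It assumes it is acceptable to loop over the distinct IDs via df['ID'].unique() rather than over every row, and it also tries groupby as a loop-free alternative. The names loop_sums and group_sums are just placeholders I made up, and I'm not sure this is the idiomatic way:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 3, 3],
                   'a': [3, 5, 6, 3, 8, 1, 2]})

# Loop over each distinct ID once, instead of once per row.
loop_sums = {}  # placeholder name: maps each ID to the sum of 'a'
for i in df['ID'].unique():
    loop_sums[int(i)] = int(df.loc[df['ID'] == i, 'a'].sum())
    print(i, loop_sums[int(i)])

# Loop-free alternative: group the rows by ID and sum column 'a'.
group_sums = df.groupby('ID')['a'].sum()  # placeholder name
```

The loop version keeps the structure of my original attempt, while the groupby version avoids filtering the frame again for every ID.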
Thanks in advance!