I have a pandas dataframe as follows:
df = pd.DataFrame({
'State':['am','am','am','am','am','am','fg','fg','fg','fg','fg','fg','fg'],
'PC':['A','A','A','B','B','B','C','C','C','A','A','A','A'],
'Party':['alpha','beta','delta','alpha','beta','gamma','alpha','beta','kappa','alpha','gamma','kappa','lambda'],
'Votes':[10,4,50,11,2,60,3,1,70,12,34,52,43]
})
I want to add a Total column, which will contain the sums of the votes for each PC. Note that the PC can have the same name (e.g. 'A' above in two different states 'am' and 'fg', so we want to sum them separately, since they are different pc).
In summary, this is the output I want:
df_out = pd.DataFrame({
'State':['am','am','am','am','am','am','fg','fg','fg','fg','fg','fg','fg'],
'PC':['A','A','A','B','B','B','C','C','C','A','A','A','A'],
'Party':['alpha','beta','delta','alpha','beta','gamma','alpha','beta','kappa','alpha','gamma','kappa','lambda'],
'Votes':[10,4,50,11,2,60,3,1,70,12,34,52,43],
'Total':[64,64,64,73,73,73,74,74,74,141,141,141,141]
})
I was trying to use cumsum
df['cumsum'] = df['Votes'].cumsum()
But this is producing a cumulative sum and I dont see a way to condition on the values in columns 'State' and columns 'pc'. How do I do that to get desired output?