I have a pandas dataframe
:
df = pd.DataFrame(data={'Name':['John', 'John', 'John', 'Jill', 'Jill', 'Jill', 'Jill'],'Score':[1,1, 1,2,2, 2, 2]})
df
Out[168]:
Name Score
0 John 1
1 John 1
2 John 1
3 Jill 2
4 Jill 2
5 Jill 2
6 Jill 2
I want to add a column that contains the rolling sum of the previous 2 values in a column (column score
here) like so:
Out[171]:
Name Score Total
0 John 1 1
1 John 1 2
2 John 1 2
3 Jill 2 2
4 Jill 2 4
5 Jill 2 4
6 Jill 2 4
I have tried groupby
and rolling
which works but the row order is not the same as the original dataframe
. Notice how Jill
is now first, probably due to some alphabetical sorting:
df.groupby('Name')['Score'].rolling(2, min_periods=1).sum()
Out[173]:
Name
Jill 3 2.0
4 4.0
5 4.0
6 4.0
John 0 1.0
1 2.0
2 2.0
Name: Score, dtype: float64
How does this work and how can I achieve the goal without doing a leftjoin / merge afterwards?