0

I have a pandas dataframe:

df = pd.DataFrame(data={'Name':['John', 'John', 'John', 'Jill', 'Jill', 'Jill', 'Jill'],'Score':[1,1, 1,2,2, 2, 2]})

df
Out[168]: 
   Name  Score
0  John      1
1  John      1
2  John      1
3  Jill      2
4  Jill      2
5  Jill      2
6  Jill      2

I want to add a column that contains the rolling sum of the previous 2 values in a column (column score here) like so:

Out[171]: 
   Name  Score  Total
0  John      1      1
1  John      1      2
2  John      1      2
3  Jill      2      2
4  Jill      2      4
5  Jill      2      4
6  Jill      2      4

I have tried groupby and rolling which works but the row order is not the same as the original dataframe. Notice how Jill is now first, probably due to some alphabetical sorting:


df.groupby('Name')['Score'].rolling(2, min_periods=1).sum()
Out[173]: 
Name   
Jill  3    2.0
      4    4.0
      5    4.0
      6    4.0
John  0    1.0
      1    2.0
      2    2.0
Name: Score, dtype: float64

How does this work and how can I achieve the goal without doing a leftjoin / merge afterwards?

gmarais
  • 1,801
  • 4
  • 16
  • 32

0 Answers0