Moving average with pandas using the 2 prior occurrences

Question

I was able to find the proper formula for a Moving average here: Moving Average SO Question

The issue is it is using the 1 occurrence prior and the current rows input. I am trying to use the 2 prior occurrence to the row I am trying to predict.

import pandas as pd
import numpy as np

df = pd.DataFrame({'person':['john','mike','john','mike','john','mike'],
                    'pts':[10,9,2,2,5,5]})

df['avg'] = df.groupby('person')['pts'].transform(lambda x: x.rolling(2).mean())

OUTPUT:

From the output we see that Johns second entry is using his first and the current row to Avg. What I am looking for is John and Mikes last occurrences to be John: 6 and Mike: 5.5 using the prior two, not the previous one and the current rows input. I am using this for a prediction and would not know the current rows pts because they haven't happend yet. New to Machine Learning and this was my first thought for a feature.

score 3 · Accepted Answer · answered Feb 22 '20 at 16:03

If want shift per groups add Series.shift to lambda function:

df['avg'] = df.groupby('person')['pts'].transform(lambda x: x.rolling(2).mean().shift())
print (df)
  person  pts  avg
0   john   10  NaN
1   mike    9  NaN
2   john    2  NaN
3   mike    2  NaN
4   john    5  6.0
5   mike    5  5.5

Grzegorz Skibinski · Answer 2 · 2020-02-22T21:17:13.113

0

Try:

df['avg'] = df.groupby('person').rolling(3)['pts'].sum().reset_index(level=0, drop=True)

df['avg']=df['avg'].sub(df['pts']).div(2)

Outputs:

  person  pts  avg
0   john   10  NaN
1   mike    9  NaN
2   john    2  NaN
3   mike    2  NaN
4   john    5  6.0
5   mike    5  5.5

edited Feb 22 '20 at 21:17

answered Feb 22 '20 at 16:08

Grzegorz Skibinski

12,624
2
11
34

Moving average with pandas using the 2 prior occurrences

2 Answers2