
How do I calculate a rolling mean or moving average that considers all the items I have seen so far?

Let's say I have a DataFrame like the one below:

   col   new_col
0    1      1
1    2      1.5
2    3      2

and so on. Now I would like to add a new column where I calculate the average of all the items in col up to that point. Specifying a window means that the first few values come out as NaN, and after that it only averages over a fixed rolling window. But I need something like the above.

AMM
  • [Expanding window moment functions](http://pandas.pydata.org/pandas-docs/stable/computation.html#expanding-window-moment-functions) – Karl D. May 17 '14 at 22:31
  • For your case, it is `df.expanding().mean()`. I couldn't find a better duplicate target, but that post summarizes expanding calculations. – ayhan Jun 20 '17 at 20:09

1 Answer


The snippet below does what you're asking for, although there is plenty of room for improvement. It uses a for loop, and there are surely faster ways to do this with a vectorized function. Note that assigning through chained indexing, as in df['new_col'].iloc[i] = ..., triggers SettingWithCopyWarning unless you set pd.options.mode.chained_assignment = None.

But it does the job:

# Libraries
import pandas as pd
import numpy as np

# Dataframe with desired input
df = pd.DataFrame({'col': [1, 2, 3]})

# Make room for a new column
df['new_col'] = np.nan

# Fill the new column with the mean of all values seen so far.
# pd.rolling_mean was removed from pandas, so Series.rolling is used instead.
# Assigning through .loc avoids chained indexing, so no
# chained_assignment setting is needed.
for i in range(1, len(df) + 1):
    seen = df['col'].iloc[:i]
    df.loc[df.index[i - 1], 'new_col'] = seen.rolling(window=i).mean().iloc[-1]
print(df)

Output:

   col  new_col
0    1      1.0
1    2      1.5
2    3      2.0
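
As the answer concedes, a vectorized version is likely faster. Here is a minimal sketch, assuming a pandas version where `expanding()` is available as a Series method (as the comment on the question suggests). The `check_col` column is a made-up name, added only to cross-check the result by hand:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3]})

# Expanding window: mean of every value from the first row up to the current one
df['new_col'] = df['col'].expanding().mean()

# Cross-check by hand: running total divided by the number of rows seen so far
df['check_col'] = df['col'].cumsum() / np.arange(1, len(df) + 1)

print(df)
```

Both columns should hold the same values. Unlike a fixed-size rolling window, the expanding window produces no leading NaN values, because its default minimum number of observations is 1.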

vestland