df.rolling can accept a string frequency offset as its first argument. For example,
import numpy as np
import pandas as pd
np.random.seed(2018)
# Generate a DataFrame with an irregular DatetimeIndex
N = 20
start = np.datetime64('2018-01-01').astype('M8[s]').view('<i8')
end = np.datetime64('2018-02-01').astype('M8[s]').view('<i8')
timestamps = np.random.uniform(start, end, size=N)
timestamps.sort()
index = timestamps.astype('M8[s]')
df = pd.DataFrame(np.random.randint(10, size=(N, 4)), columns=list('OHLC'),
                  index=index)
This computes a rolling mean using a 2-day window size:
df.rolling('2D').mean()
This computes a rolling mean using a 7-day (i.e. weekly) window size:
df.rolling('7D').mean()
Use 1H for a 1-hour window, 1D for a 1-day window, and 7D for a 1-week window.
The number of rows corresponding to the rolling window need not be constant.
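To make the varying row count concrete, here is a minimal sketch on a small, hypothetical Series (the timestamps and values are made up for illustration and are not the df above). It counts how many observations fall into each 2-day window:

```python
import numpy as np
import pandas as pd

# Hypothetical irregular timestamps, chosen only for illustration
idx = pd.to_datetime([
    '2018-01-01 00:00', '2018-01-01 12:00', '2018-01-02 06:00',
    '2018-01-05 00:00', '2018-01-05 01:00',
])
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Number of observations inside each 2-day window ending at that row
counts = s.rolling('2D').count()
# Mean over the same windows
means = s.rolling('2D').mean()

print(counts.tolist())  # [1.0, 2.0, 3.0, 1.0, 2.0]
print(means.tolist())
```

The window ending on 2018-01-02 06:00 covers three rows, while the one ending on 2018-01-05 00:00 covers only one, because the earlier rows are more than 2 days old by then.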
To check that df.rolling('7D').mean() is producing the desired result, let's confirm its last two rows.
In [91]: df.rolling('7D').mean().tail(2)
Out[91]:
                            O         H    L         C
2018-01-30 05:22:18  4.285714  3.000000  5.0  3.428571
2018-01-31 23:45:18  3.833333  2.833333  4.5  3.166667
The last row corresponds to means taken over this 7-day DataFrame:
In [93]: end = df.index[-1]; window = df.loc[end-pd.Timedelta(days=7):end]; window
Out[93]:
                     O  H  L  C
2018-01-25 21:17:07  1  2  1  2
2018-01-26 22:48:38  6  0  3  1
2018-01-28 08:28:04  0  8  7  5
2018-01-29 02:48:53  8  0  2  3
2018-01-30 05:22:18  6  0  8  8
2018-01-31 23:45:18  2  7  6  0
In [94]: window.mean()
Out[94]:
O 3.833333
H 2.833333
L 4.500000
C 3.166667
dtype: float64
The values in window.mean() match the values in the last row of df.rolling('7D').mean().
Similarly, we can confirm the result in the second-to-last row by setting end = df.index[-2]:
In [95]: end = df.index[-2]; window = df.loc[end-pd.Timedelta(days=7):end]; window
Out[95]:
                     O  H  L  C
2018-01-23 12:05:33  9  8  9  4
2018-01-24 11:16:36  0  3  5  1
2018-01-25 21:17:07  1  2  1  2
2018-01-26 22:48:38  6  0  3  1
2018-01-28 08:28:04  0  8  7  5
2018-01-29 02:48:53  8  0  2  3
2018-01-30 05:22:18  6  0  8  8
In [96]: window.mean()
Out[96]:
O 4.285714
H 3.000000
L 5.000000
C 3.428571
dtype: float64
In [99]: window.mean().equals(df.rolling('7D').mean().loc[end])
Out[99]: True
Notice that the actual number of rows in the two windows differs (6 vs 7).
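The same spot check can be run for every row at once. The sketch below is self-contained and uses its own hypothetical data (the seed, the gap sizes, and the helper manual_mean are assumptions for illustration, so the numbers will not match the output above). One caveat: as far as I know, an offset window defaults to closed='right', meaning it covers (end - 7 days, end], whereas the df.loc slice used above includes both endpoints. The two agree unless a row sits exactly 7 days before end, so the helper uses a strict comparison on the left edge.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical irregular, strictly increasing DatetimeIndex:
# gaps between rows are random, between 1 second and 3 days
gaps = rng.integers(1, 3 * 24 * 3600, size=20)
index = pd.Timestamp('2018-01-01') + pd.to_timedelta(np.cumsum(gaps), unit='s')
df = pd.DataFrame(rng.integers(10, size=(20, 4)),
                  columns=list('OHLC'), index=index)

rolled = df.rolling('7D').mean()

def manual_mean(frame, end, width=pd.Timedelta(days=7)):
    """Mean of the rows with a timestamp in (end - width, end]."""
    in_window = (frame.index > end - width) & (frame.index <= end)
    return frame[in_window].mean()

# Recompute every row of the rolling mean by explicit slicing
expected = pd.DataFrame([manual_mean(df, t) for t in df.index],
                        index=df.index)

# Compare with a tolerance, since the two computations may differ
# in floating-point rounding
all_match = np.allclose(rolled.to_numpy(), expected.to_numpy())
print(all_match)
```

np.allclose is used instead of equals here because the rolling implementation and a plain mean can round differently in the last digits.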