Approach #1
We can leverage scikit-image's view_as_windows, which is based on np.lib.stride_tricks.as_strided, to get sliding windows. More info on the use of the as_strided-based view_as_windows.
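import numpy as np
import pandas as pd
from skimage.util.shape import view_as_windows

def create_time_lagged_viewaswindows(X, shift, step):
    # Pad shift-1 zeros on the trailing side
    a_ext = np.r_[X.values, np.zeros(shift-1, dtype=X.dtype)]
    # Keep every step-th sliding window, then transpose so each window is a column
    windows_ar = view_as_windows(a_ext, shift)[:len(X)-shift+step+1:step].T
    return pd.DataFrame(windows_ar)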
A bit of explanation: the basic idea is to pad the trailing side with zeros and then create sliding windows. To create the windows, we use np.lib.stride_tricks.as_strided or skimage.util.view_as_windows.
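The pad-then-window idea can also be sketched with NumPy alone. This is a minimal sketch, not the approach's own code: it swaps in np.lib.stride_tricks.sliding_window_view (available in NumPy 1.20+) for view_as_windows, takes a plain sequence instead of a pandas Series, and the name padded_windows is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def padded_windows(values, shift, step):
    """Hypothetical helper: zero-pad the tail, then keep every step-th window."""
    a = np.asarray(values)
    # Pad shift-1 zeros on the trailing side so the late windows have data to read.
    a_ext = np.r_[a, np.zeros(shift - 1, dtype=a.dtype)]
    # All length-`shift` windows, one per row: shape (len(a_ext) - shift + 1, shift).
    windows = sliding_window_view(a_ext, shift)
    # Same slice as in create_time_lagged_viewaswindows; transpose puts windows in columns.
    return windows[:len(a) - shift + step + 1:step].T

print(padded_windows(range(5), shift=4, step=1))
print(padded_windows(range(5), shift=4, step=2))
```

Each column of the result is one window, so row i holds the series lagged by i positions.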
Sample runs -
In [166]: X = pd.Series(range(5))

In [167]: create_time_lagged_viewaswindows(X, shift=4, step=1)
Out[167]:
   0  1  2
0  0  1  2
1  1  2  3
2  2  3  4
3  3  4  0

In [168]: create_time_lagged_viewaswindows(X, shift=4, step=2)
Out[168]:
   0  1
0  0  2
1  1  3
2  2  4
3  3  0
Approach #2
We can also use np.lib.stride_tricks.as_strided directly. This requires us to set up the shape and strides arguments manually, but it avoids the transpose used in the earlier approach, which might be worth it for the extra performance. The implementation would look something along these lines -
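def create_time_lagged_asstrided(X, shift, step):
    # Pad shift-1 zeros on the trailing side
    a_ext = np.r_[X.values, np.zeros(shift-1, dtype=X.dtype)]
    strided = np.lib.stride_tricks.as_strided
    s = a_ext.strides[0]  # bytes between consecutive elements
    ncols = (len(X)-shift+2*step)//step
    # Moving down a row advances one element; moving across a column advances step elements
    windows_ar = strided(a_ext, shape=(shift,ncols), strides=(s,step*s))
    return pd.DataFrame(windows_ar)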
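To see what the shape and strides arguments select, the strided result can be compared with an explicit loop: entry (i, j) of the output is a_ext[i + j*step]. The following is a minimal sketch on a plain NumPy array rather than a pandas Series; the names lagged_strided and lagged_loop are hypothetical and only serve the comparison.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def lagged_strided(a, shift, step):
    """Same shape/strides setup as create_time_lagged_asstrided, on a NumPy array."""
    a_ext = np.r_[a, np.zeros(shift - 1, dtype=a.dtype)]
    s = a_ext.strides[0]
    ncols = (len(a) - shift + 2 * step) // step
    # One row down advances 1 element; one column right advances `step` elements.
    return as_strided(a_ext, shape=(shift, ncols), strides=(s, step * s))

def lagged_loop(a, shift, step):
    """Reference version that spells out the index arithmetic."""
    a_ext = np.r_[a, np.zeros(shift - 1, dtype=a.dtype)]
    ncols = (len(a) - shift + 2 * step) // step
    out = np.empty((shift, ncols), dtype=a.dtype)
    for i in range(shift):
        for j in range(ncols):
            out[i, j] = a_ext[i + j * step]
    return out

a = np.arange(5)
for step in (1, 2):
    assert np.array_equal(lagged_strided(a, 4, step), lagged_loop(a, 4, step))
```

Note that as_strided does no bounds checking. With this column formula the reads stay inside the padded buffer when step < shift; for larger steps (for example, length 8 with shift=4 and step=4) the last column would read past the end, so a guard on step may be worth adding in that case.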
Timings on a large array -
In [215]: X = pd.Series(range(10000))
# Original solution
In [216]: %timeit creat_time_lagged(X, shift=10, step=5)
1 loop, best of 3: 608 ms per loop
# Approach #1
In [217]: %timeit create_time_lagged_viewaswindows(X, shift=10, step=5)
10000 loops, best of 3: 146 µs per loop
# Approach #2
In [218]: %timeit create_time_lagged_asstrided(X, shift=10, step=5)
10000 loops, best of 3: 104 µs per loop