Here's one vectorized approach using NumPy tools -
windowSize = 4
a = df.values
X = strided_app(a[:,0],windowSize,1)
Y = strided_app(a[:,1],windowSize,1)
M = Y.mean(1)
mask = Y>M[:,None]
sums = np.einsum('ij,ij->i',X,mask)
rest_sums = X.sum(1) - sums
out = sums/rest_sums
strided_app is taken from here.
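The linked definition is not reproduced in this post. A minimal sketch of such a helper, assuming a 1D input array `a` with at least `L` elements, a window length `L` and a step `S`, can be built on NumPy's `as_strided`:

```python
import numpy as np

def strided_app(a, L, S):
    """Return a 2D view of 1D array `a` holding windows of length L taken every S elements."""
    nrows = (a.size - L) // S + 1   # number of complete windows
    n = a.strides[0]                # byte step between consecutive elements of `a`
    return np.lib.stride_tricks.as_strided(a, shape=(nrows, L), strides=(S * n, n))

# Small check: windows of length 4 over 0..5, moving one element at a time
print(strided_app(np.arange(6), 4, 1))
```

The check prints the rows `[0 1 2 3]`, `[1 2 3 4]` and `[2 3 4 5]`. The result is a view into the original data rather than a copy, so it is cheap to create but should be treated as read-only.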
Runtime test -
Approaches -
# @kazemakase's solution
def rolling_window_sum(df, windowSize=4):
    rw = rolling_window(df.values.T, windowSize)
    m = np.mean(rw[1], axis=-1, keepdims=True)
    a = np.sum(rw[0] * (rw[1] > m), axis=-1)
    b = np.sum(rw[0] * (rw[1] <= m), axis=-1)
    result = a / b
    return result
# Proposed in this post
def strided_einsum(df, windowSize=4):
    a = df.values
    X = strided_app(a[:,0],windowSize,1)
    Y = strided_app(a[:,1],windowSize,1)
    M = Y.mean(1)
    mask = Y>M[:,None]
    sums = np.einsum('ij,ij->i',X,mask)
    rest_sums = X.sum(1) - sums
    out = sums/rest_sums
    return out
Timings -
In [46]: df = pd.DataFrame(np.random.randint(0,9,(1000000,2)))
In [47]: %timeit rolling_window_sum(df)
10 loops, best of 3: 90.4 ms per loop
In [48]: %timeit strided_einsum(df)
10 loops, best of 3: 62.2 ms per loop
To squeeze out more performance, we can compute the Y.mean(1) part, which is basically a windowed average, with SciPy's 1D uniform filter. Thus, M could alternatively be computed for windowSize=4 as -
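# scipy.ndimage.filters is a deprecated namespace; import from scipy.ndimage instead
from scipy.ndimage import uniform_filter1d as unif1d
# [2:-1] drops the edge outputs whose 4-element window spills past the array ends
M = unif1d(a[:,1].astype(float),windowSize)[2:-1]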
The performance gain is noticeable, roughly 20% on this dataset -
In [65]: %timeit strided_einsum(df)
10 loops, best of 3: 61.5 ms per loop
In [66]: %timeit strided_einsum_unif_filter(df)
10 loops, best of 3: 49.4 ms per loop
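The definition of strided_einsum_unif_filter is not shown in this post. One plausible reconstruction, assuming it is simply strided_einsum with the mean step swapped for the uniform filter, might look like the sketch below. The strided_app helper is an assumed stand-in included so the block runs on its own, np.asarray is used so that a plain 2D array works as well as a DataFrame, and the trimming for window sizes other than 4 is an extension beyond what the post covers:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def strided_app(a, L, S):
    # Assumed helper: windows of length L, step S, as a 2D view of 1D `a`
    nrows = (a.size - L) // S + 1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows, L), strides=(S * n, n))

def strided_einsum_unif_filter(df, windowSize=4):
    a = np.asarray(df)                      # DataFrame or plain 2D array
    X = strided_app(a[:, 0], windowSize, 1)
    Y = strided_app(a[:, 1], windowSize, 1)
    # With the default origin, filter output i averages inputs
    # i - windowSize//2 ... i + (windowSize - 1 - windowSize//2),
    # so drop the edge outputs whose window spills past the array.
    lo = windowSize // 2
    hi = windowSize - 1 - lo
    filt = uniform_filter1d(a[:, 1].astype(float), windowSize)
    M = filt[lo:filt.size - hi]             # same as [2:-1] for windowSize=4
    mask = Y > M[:, None]                   # entries above their window mean
    sums = np.einsum('ij,ij->i', X, mask)   # per-window sum of X where mask holds
    rest_sums = X.sum(1) - sums             # per-window sum of the remaining X
    return sums / rest_sums
```

Calling strided_einsum_unif_filter(df) should then return the same values as strided_einsum(df) for windowSize=4. For window sizes that are not powers of two, the filtered mean may differ from Y.mean(1) in the last few bits, which could matter only where a value ties exactly with its window mean.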