Think an efficient one would be with 1D convolution
-
def sum_occurences_windowed(arr, W):
K = np.ones(W, dtype=int)
out = np.convolve(arr==0,K)[:len(arr)]
out[:W-1] = 0
return out
Sample run -
In [42]: arr
Out[42]: array([10, 20, 30, 5, 6, 0, 0, 0])
In [43]: sum_occurences_windowed(arr,W=7)
Out[43]: array([0, 0, 0, 0, 0, 0, 2, 3])
Timings on varying length arrays and window of 7
Including count_rolling
from @Quang Hoang's post
.
Using benchit
package (few benchmarking tools packaged together; disclaimer: I am its author) to benchmark proposed solutions.
import benchit
funcs = [sum_occurences_windowed, count_rolling]
in_ = {n:(np.random.randint(0,5,(n)),7) for n in [10,20,50,100,200,500,1000,2000,5000]}
t = benchit.timings(funcs, in_, multivar=True, input_name='Length')
t.plot(logx=True, save='timings.png')

Extending to generic n-dim arrays
from scipy.ndimage.filters import convolve1d
def sum_occurences_windowed_ndim(arr, W, axis=-1):
K = np.ones(W, dtype=int)
out = convolve1d((arr==0).astype(int),K,axis=axis,origin=-(W//2))
out.swapaxes(axis,0)[:W-1] = 0
return out
So, on a 2D array, for counting along each row, use axis=1
and for cols, axis=0
and so on.
Sample run -
In [155]: np.random.seed(0)
In [156]: a = np.random.randint(0,3,(3,10))
In [157]: a
Out[157]:
array([[0, 1, 0, 1, 1, 2, 0, 2, 0, 0],
[0, 2, 1, 2, 2, 0, 1, 1, 1, 1],
[0, 1, 0, 0, 1, 2, 0, 2, 0, 1]])
In [158]: sum_occurences_windowed_ndim(a, W=7)
Out[158]:
array([[0, 0, 0, 0, 0, 0, 3, 2, 3, 3],
[0, 0, 0, 0, 0, 0, 2, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 4, 3, 4, 3]])
# Verify with earlier 1D solution
In [159]: np.vstack([sum_occurences_windowed(i,7) for i in a])
Out[159]:
array([[0, 0, 0, 0, 0, 0, 3, 2, 3, 3],
[0, 0, 0, 0, 0, 0, 2, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 4, 3, 4, 3]])
Let's test out our original 1D input array -
In [187]: arr
Out[187]: array([10, 20, 30, 5, 6, 0, 0, 0])
In [188]: sum_occurences_windowed_ndim(arr, W=7)
Out[188]: array([0, 0, 0, 0, 0, 0, 2, 3])