The reason uniform_filter()
is so much faster than generic_filter()
is due to Python -- for generic_filter()
, Python gets called for each pixel, while for uniform_filter()
, the whole image is processed in native code. (I found OpenCV's boxFilter()
even faster than uniform_filter()
, see my answer to a "window variance" question.)
In the remainder of this answer, I show how to do a skew calculation using uniform_filter()
, which dramatically speeds up a generic_filter()
-based version such as:
import scipy.ndimage as ndimage, scipy.stats as st
ndimage.generic_filter(img, st.skew, size=(1,5))
SciPy's st.skew()
(see, e.g., v0.17.0) appears to calculate the skew as
m3 / m2**1.5
where m3 = E[(X-m)**3]
(the third central moment), m2 = E[(X-m)**2]
(the variance), and m = E[X]
(the mean).
To use uniform_filter()
, one has to write this in terms of raw moments such as m3p = E[X**3]
and m2p = E[X**2]
(a prime symbol is usually used to distinguish the raw moment from the central one):
m3 = E[(X-m)**3] = ... = m3p - 3*m*m2p + 2*m**3
m2 = E[(X-m)**2] = ... = m2p - m*m
(In case my "..." skips too much, this answer has the full derivation for m2.) Then one can implement skew()
using uniform_filter()
(or boxFilter()
for some additional speedup):
def winSkew(img, wsize):
imgS = img*img
m, m2p, m3p = (ndimage.uniform_filter(x, wsize) for x in (img, imgS, imgS*img))
mS = m*m
return (m3p-3*m*m2p+2*mS*m)/(m2p-mS)**1.5
Compared to generic_filter()
, winSkew()
gives a 654-fold speedup on the following example on my machine:
In [185]: img = np.random.randint(0, 256, (500,500)).astype(np.float)
In [186]: %timeit ndimage.generic_filter(img, st.skew, size=(1,5))
1 loops, best of 3: 14.2 s per loop
In [188]: %timeit winSkew(img, (1,5))
10 loops, best of 3: 21.7 ms per loop
And the two calculations give essentially identical results:
In [190]: np.allclose(winSkew(img, (1,5)), ndimage.generic_filter(img, st.skew, size=(1,5)))
Out[190]: True
The code for a Kurtosis calculation can be derived the same way.