python convolution with different dimension

Question

I'm trying to implement convolutional neural network in Python.
However, when I use signal.convolve or np.convolve, it can not do convolution on X, Y(X is 3d, Y is 2d). X are training minibatches. Y are filters. I don't want to do for loop for every training vector like:

for i in xrange(X.shape[2]):
    result = signal.convolve(X[:,:,i], Y, 'valid')
    ....

So, is there any function I can use to do convolution efficiently?

Imanol Luengo · Answer 1 · 2016-08-30T13:19:23.583

Scipy implements standard N-dimensional convolutions, so that the matrix to be convolved and the kernel are both N-dimensional.

A quick fix would be to add an extra dimension to Y so that Y is 3-Dimensional:

result = signal.convolve(X, Y[..., None], 'valid')

I'm assuming here that the last axis corresponds to the image index as in your example [width, height, image_idx] (or [height, width, image_idx]). If it is the other way around and the images are indexed in the first axis (as it is more common in C-ordering arrays) you should replace Y[..., None] with Y[None, ...].

The line Y[..., None] will add an extra axis to Y, making it 3-dimensional [kernel_width, kernel_height, 1] and thus, converting it to a valid 3-Dimensional convolution kernel.

NOTE: This assumes that all your input mini-batches have the same width x height, which is standard in CNN's.

EDIT: Some timings as @Divakar suggested.

The testing framework is setup as follows:

def test(S, N, K):
    """ S: image size, N: num images, K: kernel size"""
    a = np.random.randn(S, S, N)
    b = np.random.randn(K, K)
    valid = [slice(K//2, -K//2+1), slice(K//2, -K//2+1)]

    %timeit signal.convolve(a, b[..., None], 'valid')
    %timeit signal.fftconvolve(a, b[..., None], 'valid')
    %timeit ndimage.convolve(a, b[..., None])[valid]

Find bellow tests for different configurations:

Varying image size S:

>>> test(100, 50, 11) # 100x100 images
1 loop, best of 3: 909 ms per loop
10 loops, best of 3: 116 ms per loop
10 loops, best of 3: 54.9 ms per loop

>>> test(1000, 50, 11) # 1000x1000 images
1 loop, best of 3: 1min 51s per loop
1 loop, best of 3: 16.5 s per loop
1 loop, best of 3: 5.66 s per loop

Varying number of images N:

>>> test(100, 5, 11) # 5 images
10 loops, best of 3: 90.7 ms per loop
10 loops, best of 3: 26.7 ms per loop
100 loops, best of 3: 5.7 ms per loop

>>> test(100, 500, 11) # 500 images
1 loop, best of 3: 9.75 s per loop
1 loop, best of 3: 888 ms per loop
1 loop, best of 3: 727 ms per loop

Varying kernel size K:

>>> test(100, 50, 5) # 5x5 kernels
1 loop, best of 3: 217 ms per loop
10 loops, best of 3: 100 ms per loop
100 loops, best of 3: 11.4 ms per loop

>>> test(100, 50, 31) # 31x31 kernels
1 loop, best of 3: 4.39 s per loop
1 loop, best of 3: 220 ms per loop
1 loop, best of 3: 560 ms per loop

So, in short, ndimage.convolve is always faster, except when the kernel size is very large (as K = 31 in the last test).

Good find! For a faster version use `ndimage.filters.convolve(X,Y[...,None])` maybe with some slicing? — Divakar, Aug 30 '16 at 12:15
@Divakar it would be nice to time it (I'll do it if I can get some time), but I think that `ndimage,filters.convolve` should be slower for large kernel sizes (as it does convolution in the spatial domain whereas `signal.convolve` does it in the fourier domain). The advantage of the `ndimage` filters are that support different boundary conditions (different from fourier which I think that by definition is equivalent to padding the image with 0s). I'll do the timings asap! — Imanol Luengo, Aug 30 '16 at 12:20
@Divakar tests done! I'm feeling like this is some kind of dejavu. As you suggested, `ndimage.convolve` tends to be faster in general. I remember having a similar conversation about this in a different thread haha — Imanol Luengo, Aug 30 '16 at 13:21
You know what that's good memory and not just dejavu! ;) Here's the post I think you were referring to : http://stackoverflow.com/a/38232755/3293881 — Divakar, Aug 30 '16 at 13:25
@Divakar Haha exactly that one yes! That time I was proposing `uniform_filter` over `signal.convolve` lol. Although `uniform_filter` was faster due to the *separability* of the kernels, the above results seem to confirm the conclusion we arrived in that thread: for large kernel sizes fourier (fftconvolve) > spatial, if not spatial > fourier. — Imanol Luengo, Aug 30 '16 at 13:31

python convolution with different dimension

1 Answers1

Linked

Related