2d convolution using python and numpy

Question

I am trying to perform a 2D convolution in Python using numpy.

I have a 2D array as follows, with kernel H_r for the rows and H_c for the columns:

data = np.zeros((nr, nc), dtype=np.float32)

#fill array with some data here then convolve

for r in range(nr):
    data[r,:] = np.convolve(data[r,:], H_r, 'same')

for c in range(nc):
    data[:,c] = np.convolve(data[:,c], H_c, 'same')

data = data.astype(np.uint8)

It does not produce the output that I was expecting. Does this code look OK? I think the problem is with the cast from float32 to 8-bit. What's the best way to do this?

Thanks

How are you casting this in Matlab? Is it a difference of rounding vs. truncation? — Justin Peel, Mar 15 '10 at 22:44
Better code for 2D convolution was given in this later Q&A: https://stackoverflow.com/q/64587303/7328782 (not an exact duplicate). — Cris Luengo, Mar 12 '23 at 17:01

score 26 · Answer 1 · edited Mar 03 '22 at 14:24

It may not be the most optimized solution, but this is an implementation I have used before with the numpy library for Python:

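import numpy as np

def convolution2d(image, kernel, bias):
    # 'valid' sliding-window sum; the kernel is applied as-is (not flipped)
    m, n = kernel.shape
    if m != n:
        # only square kernels are supported
        raise ValueError("kernel must be square (m == n)")
    y, x = image.shape
    y = y - m + 1
    x = x - m + 1
    new_image = np.zeros((y, x))
    for i in range(y):
        for j in range(x):
            new_image[i][j] = np.sum(image[i:i+m, j:j+m]*kernel) + bias
    return new_image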

I hope this code helps others with the same question.

Regards.

edited Mar 03 '22 at 14:24 · erip
answered Mar 03 '17 at 12:47 · omotto

Shouldn't there be `j:j+n` instead of `j:j+m` in `image[i:i+m, j:j+m]`? – Bogdan Kandra Nov 27 '19 at 11:57
Not necessary, because you have the condition `m == n`. – omotto Nov 27 '19 at 16:02
I think you are missing the flipping of the kernel. – OuttaSpaceTime Feb 10 '22 at 22:14
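To illustrate the flipping point raised in the last comment, here is a minimal sketch, assuming a 2D image and a 2D kernel. The helper names `correlate2d_valid` and `convolve2d_valid` are hypothetical and not part of numpy. A sliding-window sum with the kernel applied as-is is a cross-correlation; flipping the kernel along both axes first turns it into a convolution. The two agree only for kernels that are symmetric under that flip.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Sliding-window sum of products, kernel applied as-is ('valid' region only)."""
    m, n = kernel.shape
    rows = image.shape[0] - m + 1
    cols = image.shape[1] - n + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i + m, j:j + n] * kernel)
    return out

def convolve2d_valid(image, kernel):
    """Convolution: flip the kernel along both axes, then correlate."""
    return correlate2d_valid(image, kernel[::-1, ::-1])

# An impulse image makes the difference visible.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
kernel = np.arange(9, dtype=float).reshape(3, 3)

conv_result = convolve2d_valid(impulse, kernel)   # reproduces the kernel
corr_result = correlate2d_valid(impulse, kernel)  # reproduces the flipped kernel
```

For a symmetric kernel such as a box or Gaussian blur the flip changes nothing, which is probably why it is often omitted.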

berna1111 · Answer 2 · 2019-01-06T02:20:10.797

Edit [Jan 2019]

@Tashus's comment below is correct, and @dudemeister's answer is thus probably more on the mark. The function he suggested is also more efficient, since it avoids a direct 2D convolution and the number of operations that would entail.

Possible Problem

I believe you are doing two 1D convolutions, the first along each row and the second along each column, and replacing the results from the first with the results of the second.

Notice that numpy.convolve with the 'same' argument returns an array of the same shape as the largest one provided, so after the first convolution you have already populated the entire data array.

One good way to visualize your arrays during these steps is to use Hinton diagrams, so you can check which elements already have a value.

Possible Solution

You can try to add the results of the two convolutions (use data[:,c] += ... instead of data[:,c] = ... in the second for loop), if your convolution matrix is the result of using the one-dimensional H_r and H_c matrices like so:

Another way to do that would be to use scipy.signal.convolve2d with a 2D kernel, which is probably what you wanted to do in the first place.

Not "replacing the results from the first with the results of the second", but rather convolving each row with the horizontal kernel, then convolving each column of those results with the vertical kernel. This is a particular mode of conv in MATLAB. — Tashus, Jan 04 '19 at 20:09
You are right, on the second loop each array element already has the result from the first convolution - the equivalent H2d would have non-null elements on the corners, which is probably better... I just realised this is what is used for blur filters on pictures to avoid the enormous number of operations a direct 2D convolution would require. Then @dudemeister 's answer is probably on the right track. — berna1111, Jan 06 '19 at 02:15
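The point settled in these comments can be checked numerically. This is a minimal sketch, assuming odd-length kernels and zero padding at the borders. The helper names `separable_same` and `direct_same` are hypothetical. Convolving each row with H_r and then each column of that result with H_c should match a single 2D convolution with the outer-product kernel `np.outer(H_c, H_r)`.

```python
import numpy as np

def separable_same(data, h_r, h_c):
    """Convolve every row with h_r, then every column of that result with h_c."""
    out = np.array(data, dtype=np.float64)
    for r in range(out.shape[0]):
        out[r, :] = np.convolve(out[r, :], h_r, 'same')
    for c in range(out.shape[1]):
        out[:, c] = np.convolve(out[:, c], h_c, 'same')
    return out

def direct_same(data, kernel2d):
    """Direct zero-padded 2D convolution, same output size (odd kernel sizes assumed)."""
    kh, kw = kernel2d.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(np.asarray(data, dtype=np.float64), ((ph, ph), (pw, pw)))
    flipped = kernel2d[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(data.shape)
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

rng = np.random.default_rng(0)
data = rng.random((6, 7))
H_r = np.array([1.0, 2.0, 1.0])
H_c = np.array([-1.0, 0.0, 1.0])

two_pass = separable_same(data, H_r, H_c)
one_pass = direct_same(data, np.outer(H_c, H_r))
```

The two-pass version needs roughly `len(H_r) + len(H_c)` multiplications per pixel instead of `len(H_r) * len(H_c)`, which is the efficiency gain mentioned in the edit above.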

score 5 · Answer 3 · answered Mar 15 '10 at 15:18

Since you already have your kernel separated you should simply use the sepfir2d function from scipy:

from scipy.signal import sepfir2d
convolved = sepfir2d(data, H_r, H_c)

On the other hand, the code you have there looks all right ...

answered Mar 15 '10 at 15:18 · dudemeister

Hi Dudemaster, I think the problem is that I am casting the output to 8bit using this command: data = np.array(data,dtype=np.int8). Is this OK? – mikip Mar 15 '10 at 15:31
@mikip are your numbers in the range of -128 to 127 before you convert them to 8bit? If not, then that is drastically changing your output. – Justin Peel Mar 15 '10 at 16:11
Well that really depends on the implementation of the convolve and also your kernel. It might be worth a try to cast both your kernel and data to float or int32 at least. Note that any decent 8bit convolution algorithm should work with (at least) 16bit temporary values because the summing during the convolve can easily overflow 8bit values, depending on the kernel. – dudemeister Mar 16 '10 at 12:33
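The overflow concern in the last comment can be seen with plain array arithmetic. This is a small sketch with made-up pixel values, not the asker's data: sums computed in 8 bits wrap around, while sums computed in a wider type keep their value and can be clamped when converting back.

```python
import numpy as np

pixels = np.array([200, 100], dtype=np.uint8)

# Adding in 8 bits wraps around modulo 256: 200 + 200 = 400 -> 144.
wrapped = pixels + pixels

# Widening first keeps the true sums; clip only when going back to 8 bits.
wide = pixels.astype(np.int32) + pixels.astype(np.int32)
safe = np.clip(wide, 0, 255).astype(np.uint8)
```

Here `wrapped` silently holds 144 for the first pixel, whereas `safe` saturates at 255.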

score 3 · Answer 4 · answered Dec 12 '20 at 21:20

I checked out many implementations and found none that suited my purpose, which should be really simple. So here is a dead-simple implementation with for loops:

def convolution2d(image, kernel, stride, padding):
    image = np.pad(image, [(padding, padding), (padding, padding)], mode='constant', constant_values=0)

    kernel_height, kernel_width = kernel.shape
    padded_height, padded_width = image.shape

    output_height = (padded_height - kernel_height) // stride + 1
    output_width = (padded_width - kernel_width) // stride + 1

    new_image = np.zeros((output_height, output_width)).astype(np.float32)

    for y in range(0, output_height):
        for x in range(0, output_width):
            new_image[y][x] = np.sum(image[y * stride:y * stride + kernel_height, x * stride:x * stride + kernel_width] * kernel).astype(np.float32)
    return new_image
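The output-size arithmetic used above can be isolated in a small helper. This is a sketch only; `conv_output_size` is a hypothetical name, and it assumes the same symmetric padding on both sides, as in the function above.

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0):
    """Number of kernel positions along one axis."""
    padded = in_size + 2 * padding
    if kernel_size > padded:
        raise ValueError("kernel is larger than the padded input")
    return (padded - kernel_size) // stride + 1

# 5 input samples, kernel 3, stride 2, one zero on each side:
# padded length 7, windows start at 0, 2 and 4
n_positions = conv_output_size(5, 3, stride=2, padding=1)
```

The integer division means that trailing samples that do not fill a whole window are simply ignored.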

score 2 · Answer 5 · edited Apr 15 '22 at 02:26

It might not be the most optimized solution either, but it is approximately ten times faster than the one proposed by @omotto, and it uses only basic numpy functions (such as reshape, expand_dims, tile...) and no 'for' loops:

def gen_idx_conv1d(in_size, ker_size):
    """
    Generates a list of indices. These indices correspond to the positions
    of a 1D input tensor on which we would like to apply a 1D convolution.

    For instance, with a 1D input array of size 5 and a kernel of size 3, the
    1D convolution product will successively look at the elements at indices [0,1,2],
    [1,2,3] and [2,3,4] in the input array. In this case, the function gen_idx_conv1d(5,3)
    outputs the following array: array([0,1,2,1,2,3,2,3,4]).

    args:
        in_size: (type: int) size of the input 1d array.
        ker_size: (type: int) kernel size.

    return:
        idx_list: (type: np.array) list of the successive indices of the 1D input array
        accessed by the 1D convolution algorithm.

    example:
        >>> gen_idx_conv1d(in_size=5, ker_size=3)
        array([0, 1, 2, 1, 2, 3, 2, 3, 4])
    """
    f = lambda dim1, dim2, axis: np.reshape(np.tile(np.expand_dims(np.arange(dim1),axis),dim2),-1)
    out_size = in_size-ker_size+1
    return f(ker_size, out_size, 0)+f(out_size, ker_size, 1)

def repeat_idx_2d(idx_list, nbof_rep, axis):
    """
    Repeats an array of indices (idx_list) a number of times (nbof_rep) "along" an axis
    (axis). This function helps to browse through a 2d array of size
    (len(idx_list),nbof_rep).

    args:
        idx_list: (type: np.array or list) a 1D array of indices.
        nbof_rep: (type: int) number of repetition.
        axis: (type: int) axis "along" which the repetition will be applied.

    return
        idx_list: (type: np.array) a 1D array of indices of size len(idx_list)*nbof_rep.

    example:
        >>> a = np.array([0, 1, 2])
        >>> repeat_idx_2d(a, 3, 0) # repeats array 'a' 3 times along 'axis' 0
        array([0, 0, 0, 1, 1, 1, 2, 2, 2])

        >>> repeat_idx_2d(a, 3, 1) # repeats array 'a' 3 times along 'axis' 1
        array([0, 1, 2, 0, 1, 2, 0, 1, 2])

        >>> b = np.reshape(np.arange(3*4), (3,4))
        >>> b[repeat_idx_2d(np.arange(3), 4, 0), repeat_idx_2d(np.arange(4), 3, 1)]
        array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
    """
    assert axis in [0,1], "Axis should be equal to 0 or 1."
    tile_axis = (nbof_rep,1) if axis else (1,nbof_rep)
    return np.reshape(np.tile(np.expand_dims(idx_list, 1),tile_axis),-1)

def conv2d(im, ker):
    """
    Performs a 'valid' 2D convolution on an image. The input image may be
    a 2D or a 3D array.

    The output image first two dimensions will be reduced depending on the 
    convolution size. 

    The kernel may be a 2D or 3D array. If 2D, it will be applied on every
    channel of the input image. If 3D, its last dimension must match the
    image one.

    args:
        im: (type: np.array) image (2D or 3D).
        ker: (type: np.array) convolution kernel (2D or 3D).

    returns:
        im: (type: np.array) convolved image.

    example:
        >>> im = np.reshape(np.arange(10*10*3),(10,10,3))/(10*10*3) # 3D image
        >>> ker = np.array([[0,1,0],[-1,0,1],[0,-1,0]]) # 2D kernel
        >>> conv2d(im, ker) # 3D array of shape (8,8,3)
    """
    if len(im.shape)==2: # if the image is a 2D array, it is reshaped by expanding the last dimension
        im = np.expand_dims(im,-1)

    im_x, im_y, im_w = im.shape

    if len(ker.shape)==2: # if the kernel is a 2D array, it is reshaped so it will be applied to all of the image channels
        ker = np.tile(np.expand_dims(ker,-1),[1,1,im_w]) # the same kernel will be applied to all of the channels 

    assert ker.shape[-1]==im.shape[-1], "Kernel and image last dimension must match."

    ker_x = ker.shape[0]
    ker_y = ker.shape[1]

    # shape of the output image
    out_x = im_x - ker_x + 1 
    out_y = im_y - ker_y + 1

    # reshapes the image to (out_x, ker_x, out_y, ker_y, im_w)
    idx_list_x = gen_idx_conv1d(im_x, ker_x) # computes the indices of a 1D conv (cf. gen_idx_conv1d doc)
    idx_list_y = gen_idx_conv1d(im_y, ker_y)

    idx_reshaped_x = repeat_idx_2d(idx_list_x, len(idx_list_y), 0) # repeats the previous indices to be used in 2D (cf. repeat_idx_2d doc)
    idx_reshaped_y = repeat_idx_2d(idx_list_y, len(idx_list_x), 1)

    im_reshaped = np.reshape(im[idx_reshaped_x, idx_reshaped_y, :], [out_x, ker_x, out_y, ker_y, im_w]) # reshapes

    # reshapes the 2D kernel
    ker = np.reshape(ker,[1, ker_x, 1, ker_y, im_w])

    # applies the kernel to the image and reduces the dimension back to the one of original input image
    return np.squeeze(np.sum(im_reshaped*ker, axis=(1,3)))

I tried to add a lot of comments to explain the method, but the overall idea is to reshape the 3D input image into a 5D one of shape (output_image_height, kernel_height, output_image_width, kernel_width, output_image_channel) and then to apply the kernel directly using basic array multiplication. Of course, this method then uses more memory (during execution the size of the image is multiplied by kernel_height*kernel_width), but it is faster.

To do this reshape step, I 'over-used' the indexing methods of numpy arrays, especially the possibility of passing a numpy array as indices into a numpy array.
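The indexing trick can be shown in one dimension. This is a minimal sketch with made-up values; it builds the index array by broadcasting rather than with tile, which is an alternative to the helper functions above and not the same code. Indexing with an integer array gathers all windows at once, and the convolution-like sum becomes one multiply and one reduction.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50])
ker_size = 3
out_size = x.size - ker_size + 1

# Row i of `idx` holds the positions covered by the i-th window.
idx = np.arange(out_size)[:, None] + np.arange(ker_size)[None, :]

# Indexing with a 2D integer array returns a 2D array of the same shape.
windows = x[idx]

# A 1D 'valid' sliding-window sum is then a single multiply-and-sum.
ker = np.array([1, 0, -1])
result = (windows * ker).sum(axis=1)
```

Flattened, `idx` is the same sequence of indices that `gen_idx_conv1d(5, 3)` documents in its docstring.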

This method could also be used to re-code the 2D convolution product in PyTorch or TensorFlow using the base math functions, but I have no doubt that it would be slower than the existing nn.conv2d operator...

I really enjoyed coding this method by only using the numpy basic tools.

That method is quick! And the idea is clever. Basically each pixel gets its own convolution kernel multiplied by the surrounding pixels and summed up. And so they need to add up correctly, so a box blur can be [[28,29,28],[28,29,28],[28,29,28]], as, somewhat unlike other routines, they need to add up to the full value to maintain brightness. — Tatarize, Nov 24 '20 at 08:19
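The same "gather all windows, then multiply" idea can also be written with `numpy.lib.stride_tricks.sliding_window_view`, which is available in NumPy 1.20 and later. This is a swapped-in technique and not the answer's index-based code: a sketch for a 2D image and a 2D kernel only, with `conv2d_windows` as a hypothetical name. Like the answer's function, it does not flip the kernel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy 1.20 or later

def conv2d_windows(im, ker):
    """'Valid' 2D sliding-window sum for a 2D image and a 2D kernel (no flip)."""
    # windows has shape (out_h, out_w, ker_h, ker_w) and is a view, not a copy
    windows = sliding_window_view(im, ker.shape)
    # multiply each window by the kernel and sum over the two kernel axes
    return np.einsum('ijkl,kl->ij', windows, ker)

im = np.arange(25, dtype=float).reshape(5, 5)
ker = np.array([[0, 1, 0], [-1, 0, 1], [0, -1, 0]], dtype=float)
out = conv2d_windows(im, ker)
```

Because the windows are a view, this avoids the extra memory of the gathered 5D array, at the cost of depending on a newer NumPy.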

score 1 · Answer 6 · answered Nov 26 '20 at 04:50

One of the most obvious approaches is to hard-code the kernel.

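import numpy as np
from PIL import Image

# img is assumed to be an already opened PIL image
img = img.convert('L')  # convert to 8-bit grayscale
a = np.array(img)
out = np.zeros([a.shape[0]-2, a.shape[1]-2], dtype='float')
# add the nine shifted copies of the image, one per kernel element
out += a[:-2, :-2]
out += a[1:-1, :-2]
out += a[2:, :-2]
out += a[:-2, 1:-1]
out += a[1:-1,1:-1]
out += a[2:, 1:-1]
out += a[:-2, 2:]
out += a[1:-1, 2:]
out += a[2:, 2:]
out /= 9.0
out = out.astype('uint8')
img = Image.fromarray(out)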

This example is a completely unrolled 3x3 box blur. For a kernel with other weights, multiply each shifted slice by its weight and divide by a different total. If you honestly want the quickest and dirtiest method, this is it. I think it beats Guillaume Mougeot's method by a factor of about 5, and his method beats the others by a factor of 10.

It may lose some of that advantage if you're doing something like a Gaussian blur and need to multiply the slices by weights.
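As an illustration of the weighted case mentioned above, here is a sketch of the same unrolled approach with the common 3x3 weights 1-2-1 / 2-4-2 / 1-2-1, which sum to 16. The function name `blur3x3_weighted` and the test image are made up for the example. Slices that share a weight are grouped so that each weight is multiplied only once.

```python
import numpy as np

def blur3x3_weighted(a):
    """Unrolled 3x3 weighted blur on a 2D array; output loses a 1-pixel border."""
    a = a.astype(np.float64)
    out = np.zeros((a.shape[0] - 2, a.shape[1] - 2))
    # corners (weight 1)
    out += a[:-2, :-2] + a[:-2, 2:] + a[2:, :-2] + a[2:, 2:]
    # edges (weight 2)
    out += 2.0 * (a[:-2, 1:-1] + a[1:-1, :-2] + a[1:-1, 2:] + a[2:, 1:-1])
    # centre (weight 4)
    out += 4.0 * a[1:-1, 1:-1]
    return out / 16.0  # the weights sum to 16

flat = np.full((5, 6), 80, dtype=np.uint8)
blurred = blur3x3_weighted(flat)
```

Converting to float before summing also avoids the 8-bit wrap-around that adding uint8 slices directly would cause.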

lovetl2002 · Answer 7 · 2022-10-11T08:12:19.600

I wrote this convolve_stride, which uses numpy.lib.stride_tricks.as_strided. It supports both strides and dilation, and it also works on tensors of order > 2.

import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv_view(X, F_s, dr, std):
    X_s = np.array(X.shape)
    F_s = np.array(F_s)
    dr = np.array(dr)
    Fd_s = (F_s - 1) * dr + 1
    if np.any(Fd_s > X_s):
        raise ValueError('(Dilated) filter size must be smaller than X')
    std = np.array(std)
    X_ss = np.array(X.strides)
    Xn_s = (X_s - Fd_s) // std + 1
    Xv_s = np.append(Xn_s, F_s)
    Xv_ss = np.tile(X_ss, 2) * np.append(std, dr)
    return as_strided(X, Xv_s, Xv_ss, writeable=False)

def convolve_stride(X, F, dr=None, std=None):
    if dr is None:
        dr = np.ones(X.ndim, dtype=int)
    if std is None:
        std = np.ones(X.ndim, dtype=int)
    if not (X.ndim == F.ndim == len(dr) == len(std)):
        raise ValueError('X.ndim, F.ndim, len(dr), len(std) must be the same')
    Xv = conv_view(X, F.shape, dr, std)
    return np.tensordot(Xv, F, axes=X.ndim)

%timeit -n 100 -r 10 convolve_stride(A, F)
#31.2 ms ± 1.31 ms per loop (mean ± std. dev. of 10 runs, 100 loops each)
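To make the stride and dilation bookkeeping in `conv_view` easier to follow, here is a one-dimensional sketch of the same idea with made-up sizes. It is not a replacement for the functions above. The view's first axis steps through window positions and its second axis steps through the (dilated) filter taps.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(10, dtype=np.int64)
f_size, dilation, stride = 3, 2, 1

dilated_size = (f_size - 1) * dilation + 1        # 5 samples spanned by the filter
n_out = (x.size - dilated_size) // stride + 1     # 6 window positions
step = x.strides[0]                               # bytes between neighbours

# axis 0 moves the window by `stride`, axis 1 moves inside it by `dilation`
view = as_strided(x, shape=(n_out, f_size),
                  strides=(stride * step, dilation * step), writeable=False)

f = np.array([1, 1, 1])
result = view @ f
```

Since `as_strided` does no bounds checking, the shape must be computed as above so that the last window still ends inside the array.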

score 0 · Answer 8 · answered Jan 30 '13 at 14:51

Try to first round and then cast to uint8:

data = data.round().astype(np.uint8)
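Rounding alone does not help when the filtered values fall outside 0..255, so it may be worth clamping as well. This is a minimal sketch with made-up values; `to_uint8` is a hypothetical helper name.

```python
import numpy as np

def to_uint8(values):
    """Round to nearest, clamp to 0..255, then convert to uint8."""
    return np.clip(np.rint(values), 0, 255).astype(np.uint8)

filtered = np.array([-3.2, 0.4, 127.6, 254.7, 300.0], dtype=np.float32)
as_bytes = to_uint8(filtered)
```

Whether clamping or rescaling is the right choice depends on the kernel: a kernel with negative weights, for instance, may need an offset rather than a clamp.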

answered Jan 30 '13 at 14:51 · Ruslan Grokhovetsky

isCzech · Answer 9 · 2022-11-04T14:05:43.303

Super simple and fast convolution using only basic numpy:

import numpy as np

def conv2d(image, kernel):
    # apply kernel to image, return image of the same shape
    # assume both image and kernel are 2D arrays
    # kernel = np.flipud(np.fliplr(kernel))  # optionally flip the kernel
    k = kernel.shape[0]
    width = k//2
    # place the image inside a frame to compensate for the kernel overlap
    a = framed(image, width)
    b = np.zeros(image.shape)  # fill the output array with zeros; do not use np.empty()
    # shift the image around each pixel, multiply by the corresponding kernel value and accumulate the results
    for p, dp, r, dr in [(i, i + image.shape[0], j, j + image.shape[1]) for i in range(k) for j in range(k)]:
        b += a[p:dp, r:dr] * kernel[p, r]
    # or just write two nested for loops if you prefer
    # np.clip(b, 0, 255, out=b)  # optionally clip values exceeding the limits
    return b

def framed(image, width):
    a = np.zeros((image.shape[0]+2*width, image.shape[1]+2*width))
    a[width:-width, width:-width] = image
    # alternatively fill the frame with ones or copy border pixels
    return a

Run it:

Image.fromarray(conv2d(image, kernel).astype('uint8'))

Instead of sliding the kernel along the image and computing the transformation pixel by pixel, create a series of shifted versions of the image corresponding to each element in the kernel and apply the corresponding kernel value to each of the shifted image versions.

This is probably the fastest you can get using just basic numpy; the speed is already comparable to the C implementation of scipy's convolve2d and better than fftconvolve. The idea is similar to @Tatarize's. The code as written assumes a square kernel with an odd side length, and it works only for one color component; for RGB just repeat for each (or modify the algorithm accordingly).
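For the "copy border pixels" alternative mentioned in the comment inside `framed`, one option is `np.pad` with `mode='edge'`. This is a sketch only; `framed_edge` is a hypothetical name and the tiny image is made up.

```python
import numpy as np

def framed_edge(image, width):
    """Surround a 2D image with `width` pixels copied from its border."""
    return np.pad(image, width, mode='edge')

image = np.array([[1, 2],
                  [3, 4]])
frame = framed_edge(image, 1)
```

Unlike the slice assignment `a[width:-width, width:-width] = image`, this also works when `width` is 0, which is the case for a 1x1 kernel.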

Black_Hat · Answer 10 · 2022-12-02T00:35:19.873

Strictly speaking, "2D convolution" is a misnomer here: under the hood, what is usually computed is the cross-correlation of two matrices.

With pad == 'same', the output has the same dimensions as the input.

The function also accepts non-square images. To perform correlation (convolution in deep learning lingo) on a batch of 2D matrices, one can iterate over all the channels and calculate the correlation of each channel slice with the respective filter slice.

For example: if the image is (28,28,3) and the filter size is (5,5,3), take each of the 3 slices from the image channels, perform the cross-correlation using the custom function below, and stack the resulting matrix in the respective dimension of the output.

import numpy as np

def get_cross_corr_2d(W, X, pad='valid'):
    if pad == 'same':
        # zero-pad so the output has the same shape as X (odd filter sizes assumed)
        pr = int((W.shape[0] - 1)/2)
        pc = int((W.shape[1] - 1)/2)
        conv_2d = np.zeros((X.shape[0], X.shape[1]))
        X_pad = np.zeros((X.shape[0] + 2*pr, X.shape[1] + 2*pc))
        X_pad[pr:pr+X.shape[0], pc:pc+X.shape[1]] = X
        for r in range(conv_2d.shape[0]):
            for c in range(conv_2d.shape[1]):
                # element-wise product; np.inner would contract only the last axis
                conv_2d[r,c] = np.sum(np.multiply(W, X_pad[r:r+W.shape[0], c:c+W.shape[1]]))
        return conv_2d

    else:
        # pads by (filter size - 1) on every side, so the output has shape
        # (X.shape[0] + W.shape[0] - 1, X.shape[1] + W.shape[1] - 1)
        pr = W.shape[0] - 1
        pc = W.shape[1] - 1
        conv_2d = np.zeros((X.shape[0] - W.shape[0] + 2*pr + 1,
                            X.shape[1] - W.shape[1] + 2*pc + 1))
        X_pad = np.zeros((X.shape[0] + 2*pr, X.shape[1] + 2*pc))
        X_pad[pr:pr+X.shape[0], pc:pc+X.shape[1]] = X
        for r in range(conv_2d.shape[0]):
            for c in range(conv_2d.shape[1]):
                conv_2d[r,c] = np.sum(np.multiply(W, X_pad[r:r+W.shape[0], c:c+W.shape[1]]))
        return conv_2d
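The per-channel procedure described before the function could look roughly like this. It is a self-contained sketch that uses its own unpadded helper, `cross_corr_valid`, rather than the function above; both helper names and the all-ones test data are made up for the example.

```python
import numpy as np

def cross_corr_valid(W, X):
    """Unpadded cross-correlation of a 2D filter W with a 2D array X."""
    rows = X.shape[0] - W.shape[0] + 1
    cols = X.shape[1] - W.shape[1] + 1
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = np.sum(W * X[r:r + W.shape[0], c:c + W.shape[1]])
    return out

def cross_corr_per_channel(filt, image):
    """Correlate channel k of the image with channel k of the filter and stack."""
    channels = [cross_corr_valid(filt[:, :, k], image[:, :, k])
                for k in range(image.shape[2])]
    return np.stack(channels, axis=-1)

image = np.ones((28, 28, 3))
filt = np.ones((5, 5, 3))
stacked = cross_corr_per_channel(filt, image)   # shape (24, 24, 3)
summed = stacked.sum(axis=-1)                   # single map, summed over channels
```

Stacking keeps one result per channel, as the text describes; a convolutional layer in a deep learning framework would instead sum over the input channels, which is what `summed` shows.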

score -2 · Answer 11 · answered Jul 03 '15 at 08:04

This code is incorrect:

for r in range(nr):
    data[r,:] = np.convolve(data[r,:], H_r, 'same')

for c in range(nc):
    data[:,c] = np.convolve(data[:,c], H_c, 'same')

See the Nussbaumer transformation from multidimensional convolution to one-dimensional convolution.

answered Jul 03 '15 at 08:04 · user5076897

This answer is very vague. What's incorrect about it? What aspect to the Nussbaumer transformation are you referring to? – Hannes Johansson Jul 03 '15 at 08:10
