While implementing a sliding window in Python to detect objects in still images, I came across this handy function:

numpy.lib.stride_tricks.as_strided

I tried to derive a general rule so that I would not make mistakes when changing the size of the sliding window. I ended up with this expression:

all_windows = as_strided(x,((x.shape[0] - xsize)/xstep ,(x.shape[1] - ysize)/ystep ,xsize,ysize), (x.strides[0]*xstep,x.strides[1]*ystep,x.strides[0],x.strides[1]))

which results in a 4-dimensional array. The first two dimensions are the number of windows along the x and y axes of the image, and the last two are the size of each window (xsize, ysize).

The step is the displacement between two consecutive windows.

This works fine if I choose a square sliding window, but I still have a problem getting it to work for windows of, e.g., (128, 64), where I usually get data unrelated to the image.

What is wrong with my code? Any ideas? And is there a better way to get sliding windows, nice and neat, in Python for image processing?

Thanks

JustInTime
  • Since you were looking for template matching algorithms, [`this post`](http://stackoverflow.com/a/41379596/3293881), which uses strides, might be worth a look. – Divakar Jan 07 '17 at 12:49

4 Answers

There is an issue in your code. This approach actually works well for 2D, and there is no reason to use the multi-dimensional version (see "Using strides for an efficient moving average filter"). Below is a fixed version:

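import numpy as np
from numpy.lib.stride_tricks import as_strided
xsize, ysize, xstep, ystep = 3, 3, 1, 1  # example window size and step (not given in the original answer)
A = np.arange(100).reshape((10, 10))
print(A)
# "+ 1" counts the last window position; "//" keeps the shape integral in Python 3
all_windows = as_strided(A, ((A.shape[0] - xsize + 1) // xstep, (A.shape[1] - ysize + 1) // ystep, xsize, ysize),
      (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
print(all_windows)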
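As a rough sketch of the same idea wrapped in a reusable function (the name `sliding_windows_2d` and its parameters are hypothetical, not from the original answer): it computes the window count as `(n - k) // step + 1`, which also keeps the last window when the step does not divide evenly, and it checks that the window fits. Note that axis 0 is rows, so a (128, 64) window here means 128 rows by 64 columns; whether that matches the layout of your image is an assumption you need to verify.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def sliding_windows_2d(a, win_rows, win_cols, step_rows=1, step_cols=1):
    """Return a read-only 4-D view of all (win_rows, win_cols) windows of a 2-D array."""
    if a.ndim != 2:
        raise ValueError("expected a 2-D array")
    if win_rows > a.shape[0] or win_cols > a.shape[1]:
        raise ValueError("window is larger than the array")
    # Number of window positions along each axis that fit entirely inside the array.
    n_rows = (a.shape[0] - win_rows) // step_rows + 1
    n_cols = (a.shape[1] - win_cols) // step_cols + 1
    shape = (n_rows, n_cols, win_rows, win_cols)
    strides = (a.strides[0] * step_rows, a.strides[1] * step_cols,
               a.strides[0], a.strides[1])
    # writeable=False guards against accidental writes through overlapping windows.
    return as_strided(a, shape=shape, strides=strides, writeable=False)

image = np.arange(200 * 150).reshape(200, 150)  # stand-in for a grayscale image
windows = sliding_windows_2d(image, 128, 64, step_rows=8, step_cols=8)
print(windows.shape)  # (10, 11, 128, 64)
```

Each `windows[i, j]` should equal `image[i*8 : i*8 + 128, j*8 : j*8 + 64]`; comparing a few of them against plain slices is a cheap way to catch a wrong shape or stride before it shows up as unrelated data.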
Roman Podlinov

Check out the answers to this question: Using strides for an efficient moving average filter. Basically strides are not a great option, although they work.

Benjamin
  • So is there a better alternative that is fast yet neat for extracting sliding windows at different scales, for detection and template matching algorithms? – JustInTime Sep 26 '11 at 07:59
  • I need to extract such subwindows in order to compute HOG features over each window and classify it with an already trained classifier, to check whether it is a subwindow I am interested in. – JustInTime Sep 27 '11 at 08:47
  • For the sake of argument, yes, something similar. Regarding HOG, I previously implemented it using nested loops to extract the cell histograms. I thought the stride tricks would improve performance and eliminate the two loops, which are really expensive in this context in a language like Python or MATLAB. – JustInTime Sep 27 '11 at 20:42
  • @JustInTime hi, were you able to solve your problem? I have some more questions regarding the sliding window approach, is it possible to contact you? – user961627 May 16 '14 at 06:35
  • Take a look at [view_as_windows](http://scikit-image.org/docs/0.10.x/api/skimage.util.html#view-as-windows). Hope it helps. – Gilberto Oct 25 '16 at 11:59
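For reference, a minimal sketch of a similar windowed view using only NumPy. This assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available; it is not mentioned in the thread. The function has no step argument, so the step is applied here by slicing the resulting view.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(20 * 12).reshape(20, 12)  # stand-in for a small grayscale image

# All 8x4 windows at every position: shape (20 - 8 + 1, 12 - 4 + 1, 8, 4).
all_windows = sliding_window_view(image, (8, 4))

# Apply a step of 4 rows and 2 columns by slicing the view (still no copy).
stepped = all_windows[::4, ::2]

print(all_windows.shape)  # (13, 9, 8, 4)
print(stepped.shape)      # (4, 5, 8, 4)
```

The returned view is read-only by default, which avoids accidental writes through overlapping windows.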

For posterity:

This is implemented in scikit-learn in the function sklearn.feature_extraction.image.extract_patches.

eickenberg
  • I believe this method makes a copy from the original image for the patches, and it seems the striding method is intended specifically to avoid doing that. – Alex Klibisz Feb 14 '17 at 19:00
  • As an alternative, see skimage.util.view_as_windows (http://scikit-image.org/docs/dev/api/skimage.util.html#skimage.util.view_as_windows) – Alex Klibisz Feb 14 '17 at 20:18
  • @AlexKlibisz `extract_patches` uses strides in the most general form (works for all `ndarray`s of arbitrary dimension and you can extract arbitrarily shaped patches at arbitrary steps). `extract_patches_2d` uses this function, but calls a reshape and thus induces a copy (this is wanted for the 2D case). [full disclosure: I wrote `extract_patches`] – eickenberg Feb 15 '17 at 16:18
  • It looks like the `skimage` function implements the exact same functionality. For completeness, here is the [source of `extract_patches`](https://github.com/scikit-learn/scikit-learn/blob/0.18.X/sklearn/feature_extraction/image.py#L242) – eickenberg Feb 15 '17 at 16:24
  • You're right, my mistake. I mistakenly used `extract_patches_2d` instead of `extract_patches`. – Alex Klibisz Feb 15 '17 at 16:52
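The view-versus-copy distinction discussed in these comments can be checked directly with `np.shares_memory`. The sketch below uses plain NumPy rather than the scikit-learn functions themselves, so it only illustrates the general mechanism (a strided view shares the image buffer, while reshaping the window grid into a flat list of patches forces a copy); the array sizes are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

image = np.arange(36).reshape(6, 6)
s0, s1 = image.strides

# 4-D strided view of all 3x3 windows (4 positions per axis); no data is copied.
view = as_strided(image, shape=(4, 4, 3, 3), strides=(s0, s1, s0, s1), writeable=False)

# Flattening the 4x4 window grid into 16 patches cannot be expressed with
# strides over the same buffer, so NumPy has to make a copy.
patches = view.reshape(-1, 3, 3)

print(np.shares_memory(view, image))     # True
print(np.shares_memory(patches, image))  # False
```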

I had a similar use case where I needed to create sliding windows over a batch of multi-channel images, and came up with the function below. I've written a more in-depth blog post covering this in the context of manually creating a convolution layer. This function implements the sliding windows and can also dilate or zero-pad the input array.

The function takes as input:

  • input - Size of (Batch, Channel, Height, Width)
  • output_size - Depends on usage, comments below.
  • kernel_size - size of the sliding window you wish to create (square)
  • padding - amount of 0-padding added to the outside of the (H,W) dimensions
  • stride - stride the sliding window should take over the inputs
  • dilate - amount to spread the cells of the input. This adds 0-filled rows/cols between elements

Typically, when performing forward convolution, you do not need to perform dilation, so your output size can be found by using the following formula (replace x with the input dimension):

(x - kernel_size + 2 * padding) // stride + 1
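As a quick illustration, the formula can be wrapped in a small helper (the name `conv_output_size` is hypothetical) and tried on a few common settings:

```python
def conv_output_size(x, kernel_size, padding=0, stride=1):
    """Number of window positions along one input dimension of length x."""
    return (x - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(28, 3))                       # 26
print(conv_output_size(28, 3, padding=1))            # 28
print(conv_output_size(28, 5, padding=2, stride=2))  # 14
```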

When performing the backward pass of convolution with this function, use a stride of 1 and set output_size to the size of your forward pass's x-input.

Sample code with an example of using this function can be found at this link.

import numpy as np

def getWindows(input, output_size, kernel_size, padding=0, stride=1, dilate=0):
    working_input = input
    working_pad = padding
    # dilate the input if necessary
    # (as written, any non-zero dilate inserts exactly one zero row/col between neighbours)
    if dilate != 0:
        working_input = np.insert(working_input, range(1, input.shape[2]), 0, axis=2)
        working_input = np.insert(working_input, range(1, input.shape[3]), 0, axis=3)

    # pad the input if necessary (zero-padding on the H and W axes only)
    if working_pad != 0:
        working_input = np.pad(working_input, pad_width=((0,), (0,), (working_pad,), (working_pad,)), mode='constant', constant_values=(0.,))

    # only the spatial size is taken from output_size;
    # batch and channel counts come from the input itself
    _, _, out_h, out_w = output_size
    out_b, out_c, _, _ = input.shape
    batch_str, channel_str, kern_h_str, kern_w_str = working_input.strides

    # result has shape (batch, channel, out_h, out_w, kernel_size, kernel_size)
    return np.lib.stride_tricks.as_strided(
        working_input,
        (out_b, out_c, out_h, out_w, kernel_size, kernel_size),
        (batch_str, channel_str, stride * kern_h_str, stride * kern_w_str, kern_h_str, kern_w_str)
    )
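To show how such a 6-D window view is typically consumed, here is a minimal, self-contained sketch of a forward convolution (cross-correlation form, no kernel flip) using `np.einsum`. It uses a simplified hypothetical helper, `windows_nchw`, without padding or dilation instead of the function above, and the array sizes are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def windows_nchw(x, kernel_size, stride=1):
    """Hypothetical helper: strided view of shape (B, C, out_h, out_w, k, k)."""
    b, c, h, w = x.shape
    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    sb, sc, sh, sw = x.strides
    return as_strided(
        x,
        shape=(b, c, out_h, out_w, kernel_size, kernel_size),
        strides=(sb, sc, stride * sh, stride * sw, sh, sw),
        writeable=False,
    )

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 8, 8))        # (batch, in_channels, H, W)
weights = rng.standard_normal((4, 3, 3, 3))  # (out_channels, in_channels, k, k)

win = windows_nchw(x, kernel_size=3, stride=1)
# For every window position, sum over input channels and both kernel axes.
out = np.einsum('bchwkl,ockl->bohw', win, weights)
print(out.shape)  # (2, 4, 6, 6)
```

Each output element is the sum of an input patch multiplied elementwise by one filter, so a single entry can be checked against `np.sum(x[b, :, i:i+3, j:j+3] * weights[o])`.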
Slvrfn