
short story:

This is a follow-up question to: Fast Way to slice image into overlapping patches and merge patches to image

How must I adapt the code provided in the answer so that it works not only on images of size (x, y) where each pixel is described by a float, but also on images where each pixel is described by a (3, 3) matrix?

Further, how can I adapt the code so that it returns a generator, allowing me to iterate over all patches without having to keep all of them in memory?

long story:

Given an image of shape (x, y) where each pixel is described by a (3, 3) matrix, the whole image can be represented as an array of shape (x, y, 3, 3). Further, given a target patch size such as (11, 11), I want to extract all overlapping patches from the image (x, y).

Note that I do not want to extract all patches from the (x, y, 3, 3) array, but from the (x, y) image, where each pixel is a matrix.

I want to use these patches in a patch classification algorithm, effectively iterating over all patches, extracting features, and learning a classifier. Yet given a huge image and a large patch size, there is no way to build a full list of patches without exceeding the memory limit.

Possible solutions:

Therefore the question is: How can I adapt this code to fit the new input data?

def patchify(img, patch_shape):
    img = np.ascontiguousarray(img)  # won't make a copy if not needed
    X, Y = img.shape
    x, y = patch_shape
    shape = ((X-x+1), (Y-y+1), x, y) # number of patches, patch_shape
    # The right strides can be worked out by:
    # 1) Thinking of `img` as a chunk of memory in C order
    # 2) Asking how many items through that chunk of memory we move when each of the
    #    indices i, j, k, l is incremented by one
    strides = img.itemsize*np.array([Y, 1, Y, 1])
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
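For reference, this is how the linked function behaves on a plain 2D float image (the array values below are just an illustration): for an image of shape (X, Y) and a patch shape (x, y) it returns a view of shape (X-x+1, Y-y+1, x, y), where entry (i, j) is the patch whose top-left corner sits at pixel (i, j).

```python
import numpy as np

def patchify(img, patch_shape):
    img = np.ascontiguousarray(img)  # won't make a copy if not needed
    X, Y = img.shape
    x, y = patch_shape
    shape = ((X - x + 1), (Y - y + 1), x, y)  # number of patches, patch_shape
    strides = img.itemsize * np.array([Y, 1, Y, 1])
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)

img = np.arange(30, dtype=float).reshape(5, 6)
patches = patchify(img, (3, 3))
print(patches.shape)  # (3, 4, 3, 3)
# Patch (i, j) is the window img[i:i+3, j:j+3]
print(np.array_equal(patches[1, 2], img[1:4, 2:5]))  # True
```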
Nikolas Rieble

1 Answer


While the answer you link to is not incorrect, I'd argue it is better not to make assumptions about the strides of the array and simply reuse whatever strides it already has. This has the added benefit of never requiring a copy of the original array, even if it is not contiguous. For your extended image shape you would do:

def patchify(img, patch_shape):
    X, Y, a, b = img.shape
    x, y = patch_shape
    shape = (X - x + 1, Y - y + 1, x, y, a, b)
    X_str, Y_str, a_str, b_str = img.strides
    strides = (X_str, Y_str, X_str, Y_str, a_str, b_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
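As a quick sanity check (the array sizes below are just illustrative), applying this to a small (x, y, 3, 3) array gives a 6D view of shape (x-10, y-10, 11, 11, 3, 3) for an (11, 11) patch size, and the view shares memory with the original array rather than copying it:

```python
import numpy as np

def patchify(img, patch_shape):
    X, Y, a, b = img.shape
    x, y = patch_shape
    shape = (X - x + 1, Y - y + 1, x, y, a, b)
    X_str, Y_str, a_str, b_str = img.strides
    strides = (X_str, Y_str, X_str, Y_str, a_str, b_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)

img = np.random.rand(20, 30, 3, 3)
patches = patchify(img, (11, 11))
print(patches.shape)  # (10, 20, 11, 11, 3, 3)
# Patch (i, j) is the window img[i:i+11, j:j+11], with no data copied
print(np.shares_memory(patches, img))  # True
```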

It is easy to get carried away and want to write some more general function that doesn't require specialization for a particular array dimensionality. If you feel the need to go there, you may find some inspiration in this gist.

Jaime
  • If I understood it correctly, this function returns a view and does not allocate memory. Yet when an operation on this array starts, a copy might be created. How can I prevent that? Possibly by returning a generator instead? – Nikolas Rieble Jan 25 '17 at 11:30
  • 1
    If you iterate over the patches, e.g. `X, Y = img.shape[:2]; for x in range(X): for y in range(Y): patch = img[x, y]; ...`, then whatever you do with `patch` will at most copy that particular patch only. It will slow things down, but it will not take over a huge amount of memory, as would happen if the whole patched view were copied at once. Other than that, whether a copy can be avoided is highly dependent on what you want to do with the patches. – Jaime Jan 25 '17 at 11:36
  • @NikolasRieble So, the best possible use case is applying some sort of reduction to this `6D` array, but generally speaking it would depend on the use case itself. – Divakar Jan 25 '17 at 11:42
  • @Divakar A list of all patches would contain much more elements than the original image. How could this be done using a reduction? – Nikolas Rieble Jan 25 '17 at 11:48
  • @NikolasRieble I didn't say to use a reduction to create this 6D array; I meant that if you are running some reduction code on this 6D array, that would justify creating such a huge array. – Divakar Jan 25 '17 at 11:53
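The iteration suggested in the comments can be sketched as a generator (a minimal sketch; the name `iter_patches` is made up for illustration). Each yielded patch is a view into the original array, so no patch data is copied unless the caller copies it:

```python
import numpy as np

def iter_patches(img, patch_shape):
    # Yield overlapping patches one at a time instead of materializing
    # the full 6D patched array. Works for 2D images and for (x, y, 3, 3)
    # arrays alike, since only the first two axes are sliced.
    x, y = patch_shape
    X = img.shape[0] - x + 1
    Y = img.shape[1] - y + 1
    for i in range(X):
        for j in range(Y):
            # A view, not a copy; whatever is done with the patch
            # will at most copy that particular patch.
            yield img[i:i + x, j:j + y]

img = np.random.rand(20, 30, 3, 3)
total = sum(1 for _ in iter_patches(img, (11, 11)))
print(total)  # 200 patches of shape (11, 11, 3, 3)
```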