Fast split and reassemble image into overlapping patches with numpy

Question

UPD: I've made a small reproducible example (see end of the post).

I want to split image into overlapping patches which are processed by another module, and then collect them back as fast as I can.

I tried to use numpy views (as_strided) function to create a read-only view of source array, but it does not seem to produce patches on the border of an image, if it cannot be covered by tiles precisely. This can be solved by padding image to correct size, but the time code runs with padding becomes comparable with a simple tiling loop then (0.015s -> 0.5s).

To collect patches back I use loop to fill empty image, with 1/2 of initial overlap (the reason is that my image processing algorithm can damage patch borders). The time to process patches with loop is 0.5s for 4960x3500 and patch size 1000 on my system.

Is there any way to make it faster (make as_strided work in my case for splitting), or even have reassembled image as a view of patches?

Images: 4960x3500, 4928x3264.jpg

Code:

import time
import numpy as np
from numpy.lib.stride_tricks import as_strided
from scipy.misc import imread, imsave, imresize


# the code produces read-only view of an image in 0.015s
# function taken from
# https://stackoverflow.com/questions/45960192/
def window_nd(a, window, steps = None, axis = None, outlist = False):
    """
    Create a windowed view over `n`-dimensional input that uses an 
    `m`-dimensional window, with `m <= n`

    Parameters
    -------------
    a : Array-like
        The array to create the view on

    window : tuple or int
        If int, the size of the window in `axis`, or in all dimensions if 
        `axis == None`

        If tuple, the shape of the desired window.  `window.size` must be:
            equal to `len(axis)` if `axis != None`, else 
            equal to `len(a.shape)`, or 
            1

    steps : tuple, int or None
        The offset between consecutive windows in desired dimension
        If None, offset is one in all dimensions
        If int, the offset for all windows over `axis`
        If tuple, the steps along each `axis`.  
            `len(steps)` must me equal to `len(axis)`

    axis : tuple, int or None
        The axes over which to apply the window
        If None, apply over all dimensions
        if tuple or int, the dimensions over which to apply the window

    outlist : boolean
        If output should be as list of windows.  
        If False, it will be an array with 
            `a.nidim + 1 <= a_view.ndim <= a.ndim *2`.  
        If True, output is a list of arrays with `a_view[0].ndim = a.ndim`
            Warning: this is a memory-intensive copy and not a view

    Returns
    -------

    a_view : ndarray
        A windowed view on the input array `a`, or copied list of windows   

    """
    ashp = np.array(a.shape)

    if axis != None:
        axs = np.array(axis, ndmin = 1)
        assert np.all(np.in1d(axs, np.arange(ashp.size))), \
            "Axes out of range"
    else:
        axs = np.arange(ashp.size)

    window = np.array(window, ndmin = 1)
    assert (window.size == axs.size) | (window.size == 1), \
        "Window dims and axes don't match"
    wshp = ashp.copy()
    wshp[axs] = window
    assert np.all(wshp <= ashp), "Window is bigger than input array in axes"

    stp = np.ones_like(ashp)
    if steps:
        steps = np.array(steps, ndmin = 1)
        assert np.all(steps > 0), \
            "Only positive steps allowed"
        assert (steps.size == axs.size) | (steps.size == 1), \
            "Steps and axes don't match"
        stp[axs] = steps

    astr = np.array(a.strides)

    shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
    strides = tuple(astr * stp) + tuple(astr)

    as_strided = np.lib.stride_tricks.as_strided
    a_view = np.squeeze(as_strided(a, 
                                 shape = shape, 
                                 strides = strides, writeable=False))
    if outlist:
        return list(a_view.reshape((-1,) + tuple(wshp)))
    else:
        # return view (N, p_h, p_w, channels)
        return a_view.reshape((-1,) + tuple(wshp)) #a_view




# split image into patches. If padding is used, everything is ok
def patchify(img, patch_shape=(1000,1000), overlap=10):

    img_h, img_w = img.shape[:2]
    p_h, p_w = patch_shape[:2]


    '''

    # REMOVE: Padding makes the code work

    # calculate number of patches needed
    n_h = (img_h - overlap) // (p_h - overlap) 

    if (img_h - overlap) % (p_h - overlap) > 0:
        n_h += 1

    n_w = (img_w - overlap) // (p_w - overlap)

    if (img_w - overlap) % (p_w - overlap) > 0:
        n_w += 1

    h_new = (p_h - overlap)*n_h + overlap
    w_new = (p_w - overlap)*n_w + overlap

    pad_h, pad_w = h_new - img_h + 1, w_new - img_w + 1
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), 'mean')
    '''


    return window_nd(img, (p_h, p_w), \
        steps=(p_h-overlap,p_w-overlap), axis=(0,1))



# simple loop to collect image back with overlap
def collect(patches, image_size, overlap=10):    

    '''

    If you have m windows of length k with an overlap of r, 
    then the total distance span, n, is given as

    n = (k-r)m + r

    Hence the number of windows, m, is

    m = (n-r)/(k-r)


    '''    

    img_h, img_w = image_size[:2]

    print('image_size {}'.format( image_size))

    n_p, p_h, p_w = patches.shape[:3]

    print('patches.shape {}'.format(patches.shape))

    # calculate number of patches needed, including overlapping ones
    n_h = (img_h - overlap) // (p_h - overlap) 

    if (img_h - overlap) % (p_h - overlap) > 0:
        n_h += 1

    n_w = (img_w - overlap) // (p_w - overlap)

    #if (img_w - overlap) % (p_w - overlap) > 0:
    #    n_w += 1


    img = np.zeros((img_h, img_w, image_size[2]), dtype=patches.dtype)


    patch_idx = 0
    pos_h = 0
    pos_w = 0

    # we know that this image size is sufficient, 
    # so we cut everything which does not fit
    for i in range(n_h):

        patch_offset_h = overlap//2 if i > 0 else 0

        height_left = img_h - pos_h 

        # overlap is needed for correctness
        h_to_insert = np.min([p_h - patch_offset_h, height_left])

        for j in range(n_w):

            p = patches[patch_idx]

            patch_offset_w = overlap//2 if j > 0 else 0

            width_left = img_w - pos_w 

            w_to_insert = np.min([p_w - patch_offset_w, width_left])

            print('h:{}, w:{}, h_i:{}, w_i:{}'.format(pos_h, pos_w,\
                h_to_insert, w_to_insert))

            # watching carefully the size of parts we copy
            img[pos_h:(pos_h+h_to_insert),pos_w:(pos_w+w_to_insert),:] = \
                    p[patch_offset_h:(h_to_insert + patch_offset_h ),    \
                      patch_offset_w:(w_to_insert + patch_offset_w), :]

            pos_w += w_to_insert - overlap // 2

            patch_idx += 1

            print('patch {}/{}'.format(patch_idx, len(patches)))

            ### REMOVE: to save what we actually have if the number of patches is less
            if patch_idx > len(patches) - 1:
                return img

        pos_w = 0    
        pos_h += h_to_insert - overlap // 2


    return img




# Test

# this image has last row of patches not calculated
#image = imread('4960x3500.jpg')

# this image has both last row and last column not calculated
# You can see that by commenting out 
'''
if (img_w - overlap) % (p_w - overlap) > 0:
        n_w += 1
'''
#in collect()

image = imread('4928x3264.jpg')

start_time = time.clock()

patches = patchify(image)

if not type(patches) is np.ndarray:
    patches = np.array(patches)

print("Patchify took: {}s".format(time.clock() - start_time))

start_time = time.clock()
res_image = collect(patches, (image.shape[0], image.shape[1], \
    image.shape[2]), overlap=10)

print("Collect took: {}s".format(time.clock() - start_time))

imsave('out.png', res_image)

UPD: Smaller example. Here I use window_nd() and skimage.util.view_as_windows trying to patchify image. The output is different from the number of patches needed to cover an array. For example, for array with dimensions (11, 15) and patch (6,6) without overlap one needs 2x3 patches. I understand, that the problem is that numpy does not produce patches with empty data for overlapping area. Can this be solved somehow efficiently?

import time
import numpy as np
from numpy.lib.stride_tricks import as_strided
from scipy.misc import imread, imsave, imresize


from skimage.util import view_as_windows


# the code produces read-only view of an image in 0.015s
# function taken from
# https://stackoverflow.com/questions/45960192/
def window_nd(a, window, steps = None, axis = None, outlist = False):
    """
    Create a windowed view over `n`-dimensional input that uses an 
    `m`-dimensional window, with `m <= n`

    Parameters
    -------------
    a : Array-like
        The array to create the view on

    window : tuple or int
        If int, the size of the window in `axis`, or in all dimensions if 
        `axis == None`

        If tuple, the shape of the desired window.  `window.size` must be:
            equal to `len(axis)` if `axis != None`, else 
            equal to `len(a.shape)`, or 
            1

    steps : tuple, int or None
        The offset between consecutive windows in desired dimension
        If None, offset is one in all dimensions
        If int, the offset for all windows over `axis`
        If tuple, the steps along each `axis`.  
            `len(steps)` must me equal to `len(axis)`

    axis : tuple, int or None
        The axes over which to apply the window
        If None, apply over all dimensions
        if tuple or int, the dimensions over which to apply the window

    outlist : boolean
        If output should be as list of windows.  
        If False, it will be an array with 
            `a.nidim + 1 <= a_view.ndim <= a.ndim *2`.  
        If True, output is a list of arrays with `a_view[0].ndim = a.ndim`
            Warning: this is a memory-intensive copy and not a view

    Returns
    -------

    a_view : ndarray
        A windowed view on the input array `a`, or copied list of windows   

    """
    ashp = np.array(a.shape)

    if axis != None:
        axs = np.array(axis, ndmin = 1)
        assert np.all(np.in1d(axs, np.arange(ashp.size))), \
            "Axes out of range"
    else:
        axs = np.arange(ashp.size)

    window = np.array(window, ndmin = 1)
    assert (window.size == axs.size) | (window.size == 1), \
        "Window dims and axes don't match"
    wshp = ashp.copy()
    wshp[axs] = window
    assert np.all(wshp <= ashp), "Window is bigger than input array in axes"

    stp = np.ones_like(ashp)
    if steps:
        steps = np.array(steps, ndmin = 1)
        assert np.all(steps > 0), \
            "Only positive steps allowed"
        assert (steps.size == axs.size) | (steps.size == 1), \
            "Steps and axes don't match"
        stp[axs] = steps

    astr = np.array(a.strides)

    shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
    strides = tuple(astr * stp) + tuple(astr)

    as_strided = np.lib.stride_tricks.as_strided
    a_view = np.squeeze(as_strided(a, 
                                 shape = shape, 
                                 strides = strides, writeable=False))
    if outlist:
        return list(a_view.reshape((-1,) + tuple(wshp)))
    else:
        # return view (N, p_h, p_w, channels)
        return a_view #a_view.reshape((-1,) + tuple(wshp)) #a_view





# split image into patches. If padding is used, everything is ok
def patchify(img, patch_shape=(6,6), overlap=3):

    img_h, img_w = img.shape[:2]
    p_h, p_w = patch_shape[:2]




    #return window_nd(img, (p_h, p_w), \
    #    steps=(p_h-overlap,p_w-overlap), axis=(0,1))

    # produces sometimes different results, but also errorneous
    w=  view_as_windows(img, (p_h, p_w), (p_h-overlap,p_w-overlap))

    return w




arr = np.array([[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
               [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]])


print(arr.shape)
img_h = 11
img_w = 15
# patch dimensions
p_h = 6
p_w = 6
overlap = 0

# calculate number of patches we need

n_h = (img_h - overlap) // (p_h - overlap) 

if (img_h - overlap) % (p_h - overlap) > 0:
    n_h += 1

n_w = (img_w - overlap) // (p_w - overlap)

if (img_w - overlap) % (p_w - overlap) > 0:
    n_w += 1

print('number of patches needed, h:{}, w:{}'.format(n_h,n_w))


patches = patchify(arr, patch_shape=(p_h, p_w), overlap=overlap)

print('split we actually have: {}'.format(patches.shape))

#print(patches)

could you please reduce the amount of code to a minimum working example of your bottleneck? Furthermore please add an expected output. — JE_Muc, Mar 05 '19 at 12:25
@Scotty1 I don't know how to reduce the code in order to have output image shown. The bottleneck is that `window_nd` function I took from https://stackoverflow.com/questions/45960192/ does not create patches for the border of an image, as shown on pictures. — Slowpoke, Mar 05 '19 at 12:27
@Scotty1- I could have removed some complications like patches overlap, but they are essential, and the solution which does not support overlapping will not work for me — Slowpoke, Mar 05 '19 at 12:29
Have you had a look at [`skimage.util.view_as_windows()`](http://scikit-image.org/docs/dev/api/skimage.util.html#skimage.util.view_as_windows)? It does exactly what you want. — Nils Werner, Mar 05 '19 at 12:29
@NilsWerner I vaguely remember from my earlier experiments that it might have the same issue, I will test it now and answer you — Slowpoke, Mar 05 '19 at 12:30
@NilsWerner I might be using it not correct, the following code works exactly the same for me instead of `window_nd()` call: `w= view_as_windows(img, (p_h, p_w, 3), (p_h-overlap,p_w-overlap, 1))` `w = w.reshape((w.shape[0]*w.shape[1], w.shape[3],w.shape[4],w.shape[5]))` `return w` — Slowpoke, Mar 05 '19 at 12:46
this is entirely too much code, no one is going to sift through all that. if you’re looking for a performance boost I suggest checking out code review, otherwise you will have to trim this down a lot — gold_cy, Mar 05 '19 at 12:47
@aws_apprentice, I understand, sorry for that. I will try to create example with small arrays somehow. But perhaps someone has already faced the problem of patchifying the image. — Slowpoke, Mar 05 '19 at 12:49
I believe `view_as_windows` would be cleaner. Could you replace `window_nd` with it? — Divakar, Mar 05 '19 at 13:34
@Divakar I've made smaller example which shows that `view_as_windows` works the same way. For array `(11,15)` and 6x6 patches without overlap it produces 2x2=4 patches instead of 2x3 — Slowpoke, Mar 05 '19 at 13:36
In my small example it's 2D array, in the initial one it's 3D image @Divakar — Slowpoke, Mar 05 '19 at 13:43
Your `patchify(image).shape` is `(12,1000,1000,3)` which came from a `(4x3)` block structure, each of `(1000,1000,3)`. Would it be okay to get the patches that way, perform your intended operations and then re-assemble? I am proposing that way because that `patchify(image)` isn't a view into `image`, but if you can keep it as `view_as_windows(image, (1000, 1000, 3),step=(990,990,1))[:,:,0,]`, which is of shape `(4, 3, 1000, 1000, 3)` and it's a view and hence would be memory efficient and easy to reassemble. — Divakar, Mar 05 '19 at 13:52
@Divakar I do not use the information about block structure and process patches in batches, so `(12, 1000,1000,3)` suits me well. I re-calculate 4x3 align again in my `collect()` function using formula, as long as order of the patches was kept intact. I thought `a_view.reshape((-1,) + tuple(wshp))` does not make copy in memory, because that code was pretty fast. — Slowpoke, Mar 05 '19 at 14:03
So, just use `view_as_windows` keeping that `(4,3)` kind of structure to get `patchify` and use that structure inside `collect`. — Divakar, Mar 05 '19 at 14:12
@Divakar The problem is that I have fixed patch size and overlap, so the picture might be not covered by tiles precisely. In this case both `view_as_windows` and `as_strided` does not produce correct result. I can pad the image for correct size of course, but the code becomes slow like a simple loop. My question was whether it can be done more efficiently. — Slowpoke, Mar 05 '19 at 14:15
Padding up-front won't be that bad, because everything else later on would be `views`. — Divakar, Mar 05 '19 at 14:19
@Divakar Surprisingly, but with that I have almost the same execution time for `patchify()` and `collect()`, where the second one uses loop. So padding wihch creates new image and copies contents makes usage of views inefficient — Slowpoke, Mar 05 '19 at 14:25

Fast split and reassemble image into overlapping patches with numpy

0 Answers0