1

Currently, I have a 4d array, say,

arr = np.arange(48).reshape((2,2,3,4))

I want to apply a function that takes a 2d array as input to each 2d array sliced from arr. I have searched and read this question, which is exactly what I want.

The function I'm using is im2col_sliding_broadcasting() which I get from here. It takes a 2d array and list of 2 elements as input and returns a 2d array. In my case: it takes 3x4 2d array and a list [2, 2] and returns 4x6 2d array.

I considered using apply_along_axis() but as said it only accepts 1d function as parameter. I can't apply im2col function this way.

I want an output that has the shape as 2x2x4x6. Surely I can achieve this with for loop, but I heard that it's too time expensive:

import numpy as np

def im2col_sliding_broadcasting(A, BSZ, stepsize=1):
    # source: https://stackoverflow.com/a/30110497/10666066
    # Parameters
    M, N = A.shape
    col_extent = N - BSZ[1] + 1
    row_extent = M - BSZ[0] + 1

    # Get Starting block indices
    start_idx = np.arange(BSZ[0])[:, None]*N + np.arange(BSZ[1])

    # Get offsetted indices across the height and width of input array
    offset_idx = np.arange(row_extent)[:, None]*N + np.arange(col_extent)

    # Get all actual indices & index into input array for final output
    return np.take(A, start_idx.ravel()[:, None] + offset_idx.ravel()[::stepsize])

arr = np.arange(48).reshape((2,2,3,4))
output = np.empty([2,2,4,6])

for i in range(2):
    for j in range(2):
        temp = im2col_sliding_broadcasting(arr[i, j], [2,2])
        output[i, j] = temp

Since my arr in fact is a 10000x3x64x64 array. So my question is: Is there another way to do this more efficiently ?

Divakar
  • 218,885
  • 19
  • 262
  • 358
AcaNg
  • 704
  • 1
  • 9
  • 26
  • `apply_along_axis` even where is works is not a time saver. It just makes iterating over the other dimensions easier. – hpaulj Jul 16 '19 at 03:43
  • @hpaulj yeah I know, but I can't even use `apply_along_axis` in this case =))) – AcaNg Jul 16 '19 at 03:45
  • So you aren't loosing anything by not being able to use it. – hpaulj Jul 16 '19 at 03:47
  • @hpaulj It's what pop up in my mind before I do looping. And then I realize it's too slow and I can't find a way to transform arr into something that can work with it – AcaNg Jul 16 '19 at 03:51
  • 1
    **approach #2** in your linked answer can be extended to work with the whole (2,2,3,4) array. **#3** also, though I'm less familiar with its internals. These strided methods initially create a `view`, which has a low memory footprint. But it's easy to trigger a copy, which could end up giving a memory error. – hpaulj Jul 16 '19 at 04:05
  • @hpaulj I don't understand too much about stride, but I'm gonna try it. Thank you very much. – AcaNg Jul 16 '19 at 04:44
  • 1
    Use `scikit-image's view_as_windows` : `from skimage.util import view_as_windows as viewW; output = viewW(arr,(1,1,2,2))[...,0,0,:,:].transpose(0,1,2,4,3,5).reshape(2,2,4,6)`. If you need a view output, skip the last reshaping. More info - https://stackoverflow.com/a/51890064 – Divakar Jul 16 '19 at 05:08
  • @Divakar though I'm still confusing a bit, it works. Thank you so much. Please post this as an answer so I will marked this as accepted. – AcaNg Jul 16 '19 at 05:59

1 Answers1

1

We can leverage np.lib.stride_tricks.as_strided based scikit-image's view_as_windows to get sliding windows. More info on use of as_strided based view_as_windows.

from skimage.util.shape import view_as_windows

W1,W2 = 2,2 # window size

# create sliding windows along last two axes1
w = view_as_windows(arr,(1,1,W1,W2))[...,0,0,:,:]

# Merge the window axes (tha last two axes) and 
# merge the axes along which those windows were created (3rd and 4th axes)
outshp = arr.shape[:-2] + (W1*W2,) + ((arr.shape[-2]-W1+1)*(arr.shape[-1]-W2+1),)
out = w.transpose(0,1,4,5,2,3).reshape(outshp)

The last step forces a copy. So, skip it if possible.

Divakar
  • 218,885
  • 19
  • 262
  • 358
  • Since each 2d array in my output is a group of columns, I would use `reshape(2,2,6,4)` and then `np.einsum('ijkl->ijlk', out)` to get the final result. And yeah, I have to create a copy, but I'm gonna delete it later so it doesn't bother me. Again, thank you. – AcaNg Jul 16 '19 at 06:55