
I need to slice sections out of a NumPy array in a specific way. Say I have a NumPy array of shape (200, 200, 4). For every index (i, j) in (200, 200), I want to select the surrounding 5x5x4 block, flatten it, and store it in another array, so the result has shape (200, 200, 100). Additionally, I want to delete the values at location (:, :, 12), which leaves a final shape of (200, 200, 99).

I've thought of two ways to go about this but they give different results and I'm not sure what I'm doing wrong.

Method 1:

import numpy as np

arr_lst = [np.random.normal(size=(200, 200)) for _ in range(4)]
slice_arr = np.zeros([200, 200, 99])

start = 0
for i, arr in enumerate(arr_lst):
    for idx, _ in np.ndenumerate(arr):


        #Getting surrounding 25 pixels
        pos_arr = arr[idx[0]-2:idx[0]+3, idx[1]-2:idx[1]+3]
        
        #Flattening into size 25 (one 5x5 window of a single channel)
        pos_arr = pos_arr.reshape(-1)

        #Near the boundaries slicing does not result in size 25
        if pos_arr.shape[0] != 25:
            pos_arr = np.full(25, np.nan)

        if i == 0:
            pos_arr = np.delete(pos_arr, 12)
            end = start + 25 - 1
        else:
            end = start + 25

        slice_arr[idx[0], idx[1], start:end] = pos_arr


    start = end

print(slice_arr[10, 100])

Method 2:

import numpy as np

arr_lst = [np.random.normal(size=(200, 200)) for _ in range(4)]      
stacked_arr = np.stack(arr_lst, axis=2)

slice_arr = np.zeros([200, 200, 100])

for i in range(200):
    for j in range(200):
        x = stacked_arr[i-2:i+3, j-2:j+3, 0:4]
        if x.shape != (5, 5, 4):
            x = np.full(100, np.nan)
        else:
            x = x.reshape(100)
        slice_arr[i,j] = x

slice_arr = np.delete(slice_arr, 12, 2)

print(slice_arr[10, 100])

The first method gives me the array that I want in the correct order, but the second method feels more natural and faster. Can this be optimized at all? Is there a fast way to slice around every index at once while keeping every slice the same shape, and then delete the unwanted entries afterwards?
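One likely source of the mismatch is the flattening order. Here is a minimal sketch on a single 5x5x4 window, with values chosen (as an illustrative assumption) so that each one encodes its channel and position. The names `window`, `pixel_major`, and `channel_major` are made up for this example:

```python
import numpy as np

# One 5x5x4 window where value = 100 * channel + (5 * row + col),
# so every entry reveals where it came from.
rows, cols, chans = np.meshgrid(
    np.arange(5), np.arange(5), np.arange(4), indexing="ij"
)
window = 100 * chans + 5 * rows + cols  # shape (5, 5, 4)

# Method 2 layout: a C-order reshape makes the channel index vary fastest.
pixel_major = window.reshape(100)

# Method 1 layout: all 25 values of channel 0, then channel 1, and so on.
channel_major = window.transpose(2, 0, 1).reshape(100)

print(pixel_major[:6])    # 0, 100, 200, 300, 1, 101: channel varies fastest
print(channel_major[:6])  # 0, 1, 2, 3, 4, 5: position varies fastest

# Index 12 is the centre pixel of channel 0 only in the channel-major layout.
print(channel_major[12])  # 12 -> row 2, col 2 of channel 0
print(pixel_major[12])    # 3  -> row 0, col 3 of channel 0
print(pixel_major[48])    # 12 -> the centre pixel sits at index 12 * 4 here
```

Under this reading, replacing `x.reshape(100)` with `x.transpose(2, 0, 1).reshape(100)` in Method 2 should reproduce Method 1's ordering, so that deleting index 12 removes the same element in both.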

hpaulj
kauii8
  • Looks like a task for `as_strided`, or a windowing function based on it. – hpaulj Jul 01 '20 at 16:52
  • https://stackoverflow.com/questions/61711831/rolling-windows-for-ndarrays and `view_as_windows`, https://scikit-image.org/docs/dev/api/skimage.util.html#skimage.util.view_as_windows – hpaulj Jul 01 '20 at 17:07
  • Does this answer your question? [Rolling windows for ndarrays](https://stackoverflow.com/questions/61711831/rolling-windows-for-ndarrays) – bnaecker Jul 01 '20 at 17:13
  • Hmm, thanks for the responses. I think this is almost what I'm looking for, but not quite. From reading the link you posted and the skimage documentation for `view_as_windows`, it appears that you can only specify the size of the window you want to cut out, and it will take the slices of the given size from the "top-left". However, I want to slice around the index, `[i-2:i+3, j-2:j+3, 0:4]`, so around the middle. In other words, I can't specify the window size and location? – kauii8 Jul 01 '20 at 18:17
  • How about adding a border of nan all around the array? Then you can get the windows without special edge treatment. – hpaulj Jul 01 '20 at 20:42
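The padding idea in the last comment can be checked on a small array. This is a rough sketch that uses NumPy's `sliding_window_view` (available from NumPy 1.20) in place of skimage's `view_as_windows`; the 6x6 input and the variable names are assumptions made for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

arr = np.arange(36, dtype=float).reshape(6, 6)

# Border of 2 NaNs on every side: shape (6, 6) -> (10, 10).
padded = np.pad(arr, pad_width=2, mode="constant", constant_values=np.nan)

# One 5x5 window per original element: shape (6, 6, 5, 5).
windows = sliding_window_view(padded, (5, 5))

# The window whose top-left corner is padded[i, j] is centred on arr[i, j].
i, j = 3, 2
centre_ok = windows[i, j, 2, 2] == arr[i, j]
interior_ok = np.array_equal(windows[i, j], arr[i - 2:i + 3, j - 2:j + 3])

# Near the edges the window is only partly NaN.
corner_has_nan = np.isnan(windows[0, 0]).any()
corner_has_data = windows[0, 0, 2, 2] == arr[0, 0]
```

Note one behavioural difference from the loops in the question: there, an incomplete window is replaced entirely by NaN, whereas with padding only the out-of-range positions are NaN.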

1 Answer


Using @hpaulj's helpful comments, I designed a solution that I think works for my purposes. It's similar to what was suggested in [Rolling windows for ndarrays](https://stackoverflow.com/questions/61711831/rolling-windows-for-ndarrays), but adds a border of np.nan values. I've posted it here in case anyone else finds it useful. For debugging purposes, I've set the values in the padded array to coordinate tuples:

import numpy as np
from skimage.util.shape import view_as_windows

arr_lst = [np.empty(shape=(200, 200), dtype=tuple) for _ in range(4)]
arr_lst = [np.pad(x, pad_width=2, mode='constant', constant_values=np.nan) for x in arr_lst]
padded_arr = np.stack(arr_lst, axis=2)

for idx, _ in np.ndenumerate(padded_arr):
    padded_arr[idx[0], idx[1], idx[2]] = idx

w = view_as_windows(padded_arr, (5, 5, 4)).reshape(200, 200, 100)
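For float data, the whole pipeline from the question (window, flatten, drop index 12) might look roughly like the sketch below. It is a NumPy-only variant, not the code above: it swaps in `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or later) for `view_as_windows`, and the function name `centred_windows` and its parameters are hypothetical. It flattens in the channel-major order of Method 1, so index 12 is the centre pixel of the first channel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20


def centred_windows(stacked, half=2, drop=12):
    """Flatten the (2*half+1)^2 neighbourhood of every pixel, channel-major.

    Hypothetical helper: takes an (H, W, C) float array and returns an
    (H, W, C*k*k - 1) array with k = 2*half + 1, after deleting index `drop`.
    """
    h, w, c = stacked.shape
    k = 2 * half + 1
    # Pad only the two spatial axes with NaN, not the channel axis.
    padded = np.pad(
        stacked,
        ((half, half), (half, half), (0, 0)),
        mode="constant",
        constant_values=np.nan,
    )
    # Windows over axes 0 and 1 are appended last: shape (H, W, C, k, k).
    win = sliding_window_view(padded, (k, k), axis=(0, 1))
    # The reshape copies the data; order is channel, then row, then column.
    flat = win.reshape(h, w, c * k * k)
    return np.delete(flat, drop, axis=2)


rng = np.random.default_rng(0)
stacked = rng.normal(size=(8, 9, 4))  # small stand-in for (200, 200, 4)
out = centred_windows(stacked)
print(out.shape)  # (8, 9, 99)
```

As with any padding-based approach, edge windows here keep their valid values and contain NaN only where the window leaves the array, which differs from the all-NaN edge windows produced by the loops in the question.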

kauii8