
I'm trying to implement fprop for a MaxPooling layer in a convolutional network, with non-overlapping 2x2 pooling regions. To do so, I need to split my input matrix into 2x2 matrices so that I can extract the maximum of each. I then create a mask that I can use later in bprop. To do the splitting, I split the input matrix first vertically and then horizontally with vsplit and hsplit, and find the maximum with amax. However, this keeps crashing with index-out-of-bounds exceptions, and I am not sure where the error is. Is there a simpler way to split the 24 x 24 input matrix into 144 2x2 matrices so that I can obtain the maximum of each?

This is my current code:

for i in range(inputs.shape[0]):
    for j in range(inputs.shape[1]):
        for k in range(inputs.shape[2] // 2):
            for h in range(inputs.shape[3] // 2):

                outputs[i,j,k,h] = np.amax(np.hsplit(np.vsplit(inputs[i,j], inputs.shape[2] // 2)[k], inputs.shape[1] // 2)[h])

                max_ind = np.argmax(np.hsplit(np.vsplit(inputs[i,j], inputs.shape[2] // 2)[k], inputs.shape[1] // 2)[h])

                max_ind_y = max_ind // inputs.shape[2]

                if (max_ind_y == 0):
                    max_ind_x = max_ind
                else:
                    max_ind_x = max_ind % inputs.shape[3]

                self.mask[i,j,max_ind_y + 2 * k, max_ind_x + 2 * h] = outputs[i,j,k,h]

EDIT:

This is the output produced by reshape:

[image: output produced by reshape, not reproduced here]

What I would like instead is

[0 1 
 4 5]

[2 3 
 6 7]

and so on...

Alk

2 Answers


This is implemented as view_as_blocks in skimage.util:

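import skimage.util  # requires scikit-image
blocks = skimage.util.view_as_blocks(a, (2, 2))  # a is a 2D array; result has shape (rows//2, cols//2, 2, 2)
maxs = blocks.max(axis=(2, 3))  # maximum of each 2x2 block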
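If scikit-image is not available, the same block layout can be obtained with plain NumPy. The following is a minimal sketch, assuming a 2D array whose dimensions are both even; the 4x4 `arange` input is only an illustration chosen to match the layout shown in the question's EDIT.

```python
import numpy as np

# Small 4x4 example matching the layout in the question's EDIT.
a = np.arange(16).reshape(4, 4)

# Split rows and columns into (block index, offset within block), then
# move the two block indices to the front: shape (2, 2, 2, 2).
blocks = a.reshape(a.shape[0] // 2, 2, a.shape[1] // 2, 2).swapaxes(1, 2)

# Maximum over the two within-block axes, one value per 2x2 block.
maxs = blocks.max(axis=(2, 3))

print(blocks[0, 0])  # first block:  [[0 1], [4 5]]
print(blocks[0, 1])  # second block: [[2 3], [6 7]]
```

The `swapaxes(1, 2)` call is what a plain `reshape` alone misses: without it, the within-block row axis sits between the two block indices, so the pieces do not come out as 2x2 tiles.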
B. M.

Step #1 : Getting max_ind_x, max_ind_y

We need the row and column indices of the max element within each block. For a single 2D input of shape (m, n) with even m and n -

m,n = inputs.shape
a = inputs.reshape(m//2,2,n//2,2).swapaxes(1,2)
row, col = np.unravel_index(a.reshape(a.shape[:-2] + (4,)).argmax(-1), (2,2))
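To make the result concrete, here is a small sketch that runs the lines above on the 6x4 sample array used further down. The names `flat_idx`, `global_row` and `global_col` are illustrative additions, not part of the original code. Note that the flat argmax within a 2x2 block ranges over 0..3, so it is unravelled against `(2, 2)` rather than against the full input width.

```python
import numpy as np

inputs = np.array([[55, 58, 75, 78],
                   [78, 20, 94, 32],
                   [47, 98, 81, 23],
                   [69, 76, 50, 98],
                   [57, 92, 48, 36],
                   [88, 83, 20, 31]])

m, n = inputs.shape
a = inputs.reshape(m // 2, 2, n // 2, 2).swapaxes(1, 2)  # shape (3, 2, 2, 2)

# Flat position (0..3) of the maximum inside each 2x2 block.
flat_idx = a.reshape(a.shape[:-2] + (4,)).argmax(-1)
row, col = np.unravel_index(flat_idx, (2, 2))

# Illustrative names: position of each block maximum in the full array,
# i.e. block offset (2 * block index) plus the within-block offset.
global_row = 2 * np.arange(m // 2)[:, None] + row
global_col = 2 * np.arange(n // 2) + col

print(inputs[global_row, global_col])  # the per-block maxima
```

Here `row` and `col` only ever contain 0 or 1; adding twice the block index recovers the coordinates in the original array, which is what the `max_ind_y + 2 * k` and `max_ind_x + 2 * h` terms in the question are aiming for.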

Step #2 : Setting output array with argmax places from the input

Then, judging from your code, you want an output array that is zero everywhere except at those argmax positions, which hold the values from the input array. Hence, we could do -

out = np.zeros_like(a)
M,N = a.shape[:2]
indx_tuple = np.arange(M)[:,None],np.arange(N), row, col
out[indx_tuple] = a[indx_tuple]

Finally, we can restore the 2D shape of the output, which also makes a good verification step against the original inputs array -

out2d = out.reshape(a.shape[:2]+(2,2)).swapaxes(1,2).reshape(m,n)

Sample input, output -

In [291]: np.random.seed(0)
     ...: inputs = np.random.randint(11,99,(6,4))

In [292]: inputs
Out[292]: 
array([[55, 58, 75, 78],
       [78, 20, 94, 32],
       [47, 98, 81, 23],
       [69, 76, 50, 98],
       [57, 92, 48, 36],
       [88, 83, 20, 31]])

In [286]: out2d
Out[286]: 
array([[ 0,  0,  0,  0],
       [78,  0, 94,  0],
       [ 0, 98,  0,  0],
       [ 0,  0,  0, 98],
       [ 0, 92, 48,  0],
       [ 0,  0,  0,  0]])
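The question indexes `inputs` with four indices, so the input is presumably a 4D array of shape (batch, channels, height, width). The same reshape idea extends to that case without Python loops. This is a rough sketch under that assumption, with height and width even; `max_pool_2x2` is a hypothetical helper name, and the mask stores the max values at their positions, as the question's loop does.

```python
import numpy as np

def max_pool_2x2(inputs):
    """Non-overlapping 2x2 max pooling (hypothetical helper).

    Assumes inputs has shape (batch, channels, H, W) with H and W even.
    Returns (outputs, mask): outputs has shape (batch, channels, H//2, W//2);
    mask has the shape of inputs, holding each block maximum at its
    original position and zeros elsewhere.
    """
    b, c, h, w = inputs.shape
    # Shape (b, c, h//2, w//2, 2, 2): the last two axes index within a block.
    blocks = inputs.reshape(b, c, h // 2, 2, w // 2, 2).swapaxes(3, 4)
    flat = blocks.reshape(b, c, h // 2, w // 2, 4)
    idx = flat.argmax(-1)      # first maximum wins on ties
    outputs = flat.max(-1)
    # One-hot over the 4 positions of each block, scaled by the block max.
    mask_flat = (np.arange(4) == idx[..., None]) * outputs[..., None]
    mask = (mask_flat.reshape(b, c, h // 2, w // 2, 2, 2)
            .swapaxes(3, 4)
            .reshape(b, c, h, w))
    return outputs, mask

# Usage with a random batch of 24x24 feature maps.
rng = np.random.default_rng(0)
x = rng.random((2, 3, 24, 24))
outputs, mask = max_pool_2x2(x)
print(outputs.shape, mask.shape)  # (2, 3, 12, 12) (2, 3, 24, 24)
```

Applied to the 6x4 sample above with two leading singleton axes (`inputs[None, None]`), the mask should match `out2d`. If block maxima can be exactly zero, a boolean mask may be a safer choice for bprop than a value mask.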
Divakar