14

I'm trying to using numpy.lib.stride_tricks.as_strided to iterate over non-overlapping blocks of an array, but I'm having trouble finding documentation of the parameters, so I've only been able to get overlapping blocks.

For example, I have a 4x5 array which I'd like to get 4 2x2 blocks from. I'm fine with the extra cells on the right and bottom edge being excluded.

So far, my code is:

import sys
import numpy as np

a = np.array([
[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15],
[16,17,18,19,20],
])

sz = a.itemsize
h,w = a.shape
bh,bw = 2,2

shape = (h/bh, w/bw, bh, bw)
strides = (w*sz, sz, w*sz, sz)
blocks = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

print blocks[0][0]
assert blocks[0][0].tolist() == [[1, 2], [6,7]]
print blocks[0][1]
assert blocks[0][1].tolist() == [[3,4], [8,9]]
print blocks[1][0]
assert blocks[1][0].tolist() == [[11, 12], [16, 17]]

The shape of the resulting blocks array seems to be correct, but the last two asserts fail, presumably because my shape or strides parameters are incorrect. What values for these should I set to get non-overlapping blocks?

Cerin
  • 60,957
  • 96
  • 316
  • 522

3 Answers3

18
import numpy as np
n=4
m=5
a = np.arange(1,n*m+1).reshape(n,m)
print(a)
# [[ 1  2  3  4  5]
#  [ 6  7  8  9 10]
#  [11 12 13 14 15]
#  [16 17 18 19 20]]
sz = a.itemsize
h,w = a.shape
bh,bw = 2,2
shape = (h/bh, w/bw, bh, bw)
print(shape)
# (2, 2, 2, 2)

strides = sz*np.array([w*bh,bw,w,1])
print(strides)
# [40  8 20  4]

blocks=np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
print(blocks)
# [[[[ 1  2]
#    [ 6  7]]
#   [[ 3  4]
#    [ 8  9]]]
#  [[[11 12]
#    [16 17]]
#   [[13 14]
#    [18 19]]]]

Starting at the 1 in a (i.e. blocks[0,0,0,0]), to get to the 2 (i.e. blocks[0,0,0,1]) is one item away. Since (on my machine) the a.itemsize is 4 bytes, the stride is 1*4 = 4. This gives us the last value in strides = (10,2,5,1)*a.itemsize = (40,8,20,4).

Starting at the 1 again, to get to the 6 (i.e. blocks[0,0,1,0]), is 5 (i.e. w) items away, so the stride is 5*4 = 20. This accounts for the second to last value in strides.

Starting at the 1 yet again, to get to the 3 (i.e. blocks[0,1,0,0]), is 2 (i.e. bw) items away, so the stride is 2*4 = 8. This accounts for the second value in strides.

Finally, starting at the 1, to get to 11 (i.e. blocks[1,0,0,0]), is 10 (i.e. w*bh) items away, so the stride is 10*4 = 40. So strides = (40,8,20,4).

unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • 1
    Thanks. It looks like our a.itemsize differs (mine is 8). I refactored your code to use a formula (based on your explanation) to define strides, so it'll work for everyone. – Cerin Nov 09 '11 at 22:00
  • @unutbu, when running your code, I got error message like TypeError: 'float' object cannot be interpreted as an integer. – user288609 Dec 12 '19 at 23:11
  • in the example `shape` includes floats due to the divisions -> `h/bh` and `w/bw`. `int` both and you should be fine. – neonwatty Feb 11 '23 at 18:08
8

Using @unutbu's answer as an example, I wrote a function that implements this tiling trick for any ND array. See below for link to source.

>>> a = numpy.arange(1,21).reshape(4,5)

>>> print a
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]]

>>> blocks = blockwise_view(a, blockshape=(2,2), require_aligned_blocks=False)

>>> print blocks
[[[[ 1 2]
   [ 6 7]]

  [[ 3 4]
   [ 8 9]]]


 [[[11 12]
   [16 17]]

  [[13 14]
   [18 19]]]]

[blockwise_view.py] [test_blockwise_view.py]

Stuart Berg
  • 17,026
  • 12
  • 67
  • 99
2

scikit-image has a function named view_as_blocks() that does almost what you need. The only problem is that it has an extra assert that forbids your use case, since your blocks don't divide evenly into your array shape. But in your case, the assert isn't necessary, so you can copy the function source code and safely remove the pesky assert yourself.

Stuart Berg
  • 17,026
  • 12
  • 67
  • 99