
I have a 2n x 2m numpy array. I would like to form an n x m array by randomly selecting one element from each of the non-overlapping 2 x 2 sub-arrays that partition my initial array. What would be the best way to do so? Is there a way to avoid two for loops (one along each dimension)?

For example, if my array is

1 2 3 4
5 6 7 8
9 0 1 2
8 5 7 0

then, there are four 2 x 2 sub-arrays that partition it:

1 2    3 4
5 6    7 8

9 0    1 2
8 5    7 0

and I would like to randomly pick one element from each of them to form new arrays, such as

5 3  ,  6 8  ,  2 3
9 2     9 1     0 0  .
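
For reference, the two-loop baseline described above might look like the following. This is only a minimal sketch assuming both dimensions are even; `pick_loop` and the `rng` argument are illustrative names, not part of any library.

```python
import numpy as np

def pick_loop(arr, rng=None):
    """Pick one random element from each 2 x 2 block using two explicit loops."""
    # Hypothetical helper: uses NumPy's Generator API for the random draws.
    rng = np.random.default_rng() if rng is None else rng
    n, m = arr.shape[0] // 2, arr.shape[1] // 2
    out = np.empty((n, m), dtype=arr.dtype)
    for i in range(n):
        for j in range(m):
            # The 2 x 2 block whose top-left corner is at (2i, 2j)
            block = arr[2 * i:2 * i + 2, 2 * j:2 * j + 2]
            out[i, j] = block[rng.integers(2), rng.integers(2)]
    return out

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 1, 2], [8, 5, 7, 0]])
out = pick_loop(a, np.random.default_rng(0))
print(out)
```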

Thank you for your time.

Rocky Li

2 Answers


This can be done by sampling. Instead of sampling each 2x2 square separately, we split the entire ndarray into 4 separate ndarrays, one for each position within a 2x2 square, so that the same index in each of those sub-arrays points into the same 2x2 square. Then we randomly choose between those 4 ndarrays at each index:

import numpy as np

# Create a test dataset
test = np.arange(36).reshape(6, 6)
>>> test
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

# Create the four subsamples, one per position within each 2x2 square:
# top-left, bottom-right, top-right, bottom-left
samples = np.array([test[::2, ::2], test[1::2, 1::2], test[::2, 1::2], test[1::2, ::2]])
>>> samples
array([[[ 0,  2,  4],
        [12, 14, 16],
        [24, 26, 28]],

       [[ 7,  9, 11],
        [19, 21, 23],
        [31, 33, 35]],

       [[ 1,  3,  5],
        [13, 15, 17],
        [25, 27, 29]],

       [[ 6,  8, 10],
        [18, 20, 22],
        [30, 32, 34]]])

Now the same index in each of these 4 subsamples points to the same 2x2 square of the original ndarray. We just need to pick one of the 4 subsamples at random for each index:

# Random choice sampling between these 4 subsamples.
select = np.random.randint(4,size=(3,3))
>>> select
array([[2, 2, 1],
       [3, 1, 1],
       [3, 0, 0]])

result = select.choose(samples)
>>> result

array([[ 1,  3, 11],
       [18, 21, 23],
       [30, 26, 28]])
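
The steps above can be wrapped into a small function. This is only a sketch, assuming a 2D input with even dimensions; the name `random_pool_2x2` and the optional `rng` argument are illustrative, and it uses `np.random.default_rng` rather than `np.random.randint`.

```python
import numpy as np

def random_pool_2x2(arr, rng=None):
    """Pick one random element from each non-overlapping 2x2 square of arr."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = arr.shape
    if h % 2 or w % 2:
        raise ValueError("both dimensions must be even")
    # Four strided views, one per position inside each 2x2 square
    samples = np.array([arr[::2, ::2], arr[1::2, 1::2],
                        arr[::2, 1::2], arr[1::2, ::2]])
    # For every square, choose which of the four views to read from
    select = rng.integers(4, size=(h // 2, w // 2))
    return select.choose(samples)

test = np.arange(36).reshape(6, 6)
result = random_pool_2x2(test, np.random.default_rng(0))
print(result)
```

Passing a seeded generator makes the selection reproducible, which helps when checking that each output element really comes from its own square.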
Rocky Li

I took the blockshaped function from another answer. This answer assumes that both dimensions of your original array are divisible by the block size.

import numpy as np

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))


arr = np.array([[1,2,3,4],[5,6,7,8],[9,0,1,2],[8,5,7,0]])

#  arr is a 2D array of shape m x n
m = arr.shape[0]
n = arr.shape[1]

#  define blocksize
block_size = 2

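#  divide into non-overlapping block_size x block_size sub-arrays
#  blocks has shape (N, block_size, block_size)
blocks = blockshaped(arr, block_size, block_size)

#  select one random element from each block to form the new array
#  (the number of blocks is N, not block_size**2)
num_blocks = blocks.shape[0]
rows = np.random.randint(low=0, high=block_size, size=num_blocks)
cols = np.random.randint(low=0, high=block_size, size=num_blocks)
new_arr = blocks[np.arange(num_blocks), rows, cols].reshape(m // block_size, n // block_size)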

print("original array:")
print(arr)

print("random pooled array:")
print(new_arr)
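
The same block-reshaping idea can also be written without the intermediate list of blocks. The sketch below is one possible variant, not the method from the linked answer: it flattens each block along a last axis and picks one entry per block with `np.take_along_axis`. The name `random_pool` is illustrative, and it assumes both dimensions are divisible by `block`.

```python
import numpy as np

def random_pool(arr, block, rng=None):
    """Pick one random element from each non-overlapping block x block tile."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = arr.shape
    if h % block or w % block:
        raise ValueError("array dimensions must be divisible by block")
    # (h//b, b, w//b, b) -> (h//b, w//b, b, b) -> (h//b, w//b, b*b)
    flat = (arr.reshape(h // block, block, w // block, block)
               .swapaxes(1, 2)
               .reshape(h // block, w // block, block * block))
    # One random flat index per tile; the trailing axis of length 1 lets
    # take_along_axis select a single entry from each tile
    idx = rng.integers(block * block, size=(h // block, w // block, 1))
    return np.take_along_axis(flat, idx, axis=2)[..., 0]

grid = np.arange(24).reshape(4, 6)
pooled = random_pool(grid, 2, np.random.default_rng(0))
print(pooled)
```

The result already has shape (h // block, w // block), so no final reshape is needed.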
unlut