How to reduce the dimensions of a numpy array by using the sum over n elements?

Question

I have the following array (numbers are placeholders for illustration):

import numpy as np

arr = np.array([[1,  1,  1,    2,  2,  2,    3,  3,  3,    4,  4,  4 ],
                [1,  1,  1,    2,  2,  2,    3,  3,  3,    4,  4,  4 ],
                [1,  1,  1,    2,  2,  2,    3,  3,  3,    4,  4,  4 ],

                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],
                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],
                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],

                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],
                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],
                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],

                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16],
                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16],
                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16]])

I would like to reduce the dimensions so that every 9 elements (a 3x3 area) sharing the same number here are summed up. So the 12x12 array should become a 4x4 array.

I looked here for other answers and found something for a 1D array, which I adapted. However, it is not working as expected:

result = np.sum(arr.reshape(-1,3), axis=1)
result = np.sum(result.reshape(3,-1), axis=0)

What is the correct way to achieve the desired result?

The question linked by @MykolaZotko contains the partial answer, but for this case I think we need strides, as included in my answer below. — Felix, Nov 03 '20 at 12:33
The question asks about something else. They are linked, in terms of how they can be solved, but they ask different things @MykolaZotko — yatu, Nov 03 '20 at 12:43

Felix · Answer 1 · 2020-11-03T12:53:02.740

I suggest following Nils' answer, as it is simpler and more efficient for this particular case. What I suggest below is more general, in case you want something other than just a sum.

You are looking for convolution. A small kernel is run over the array, performing element-wise multiplications and summing the results, generating values at each step for a new array. In this case we want a simple sum, so we'll use a kernel of ones with the appropriate size (3x3). Because we want no overlap, our stride is also 3 in both directions.

2D convolution is not available in NumPy, so we'll have to import it from SciPy. That function doesn't have stride (skipping) functionality, so we'll apply the stride manually by slicing the result.

import numpy as np
from scipy.signal import convolve2d

kernel = np.ones((3, 3))
convolved = convolve2d(arr, kernel, mode='valid')
strided = convolved[::3, ::3]

strided contains the result here, and we can check the final result by dividing by nine, to get the original value of each cell.

>>> strided / 9
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])
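
As a sanity check, here is a minimal self-contained sketch comparing the strided convolution with a plain reshape-based block sum. The array is rebuilt with `np.kron` purely for brevity, and the names `block` and `reference` are my own for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
arr = np.kron(np.arange(1, 17).reshape(4, 4), np.ones((3, 3), dtype=int))

block = 3
kernel = np.ones((block, block))

# 'valid' keeps only positions where the kernel fits entirely: shape (10, 10).
convolved = convolve2d(arr, kernel, mode='valid')
# Keep every third position so that the windows do not overlap: shape (4, 4).
strided = convolved[::block, ::block]

# Reference: sum each 3x3 block directly via reshape.
reference = arr.reshape(4, block, 4, block).sum(axis=(1, 3))
print(np.allclose(strided, reference))  # True
```

Note that the convolution computes all 10x10 window sums and then discards most of them, which is why the reshape approach is cheaper when only a plain sum is needed.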

Nils Werner · Accepted Answer · 2020-11-03T12:50:45.827

If we look at the flattened array

arr.ravel()
# array([ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4,  1,  1,  1,  2,  2,
#         2,  3,  3,  3,  4,  4,  4,  1,  1,  1,  2,  2,  2,  3,  3,  3,  4,
#         4,  4,  5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8,  5,  5,  5,
#         6,  6,  6,  7,  7,  7,  8,  8,  8,  5,  5,  5,  6,  6,  6,  7,  7,
#         7,  8,  8,  8,  9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12,  9,
#         9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12,  9,  9,  9, 10, 10, 10,
#        11, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16,
#        16, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16, 16, 13, 13, 13, 14,
#        14, 14, 15, 15, 15, 16, 16, 16])

We can see a pattern

Groups of 3 digits
in groups of 4
in super-groups of 3

Use that to reshape your array (from back to front), and take the sum

arr.reshape(-1, 3, 4, 3).sum((-1, -3))
# array([[  9,  18,  27,  36],
#        [ 45,  54,  63,  72],
#        [ 81,  90,  99, 108],
#        [117, 126, 135, 144]])
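
The hard-coded 4 can be derived from the array shape and the block size. A minimal sketch, assuming a 2-D array whose dimensions are exact multiples of the block size (`block_sum` is a hypothetical helper name, not a NumPy function):

```python
import numpy as np

def block_sum(a, block):
    """Sum non-overlapping block x block tiles of a 2-D array."""
    rows, cols = a.shape
    if rows % block or cols % block:
        raise ValueError("array dimensions must be multiples of the block size")
    # Axes after reshape: (block row, row within block, block col, col within block).
    # Summing axes 1 and 3 collapses each tile to a single value.
    return a.reshape(rows // block, block, cols // block, block).sum(axis=(1, 3))

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
arr = np.kron(np.arange(1, 17).reshape(4, 4), np.ones((3, 3), dtype=int))
result = block_sum(arr, 3)
print(result)
# [[  9  18  27  36]
#  [ 45  54  63  72]
#  [ 81  90  99 108]
#  [117 126 135 144]]
```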

Maybe that 4 can be obtained knowing the block size? `arr.shape[0]//3`? — yatu, Nov 03 '20 at 12:50

score 1 · Answer 3 · answered Nov 03 '20 at 12:49

After a bit of fiddling with reshape, I came up with this:

arr = np.array([[ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12]])

a = np.size(arr,0)//3
b = np.size(arr,1)//3

np.sum(arr.reshape(a, 3, b, 3), axis=(1,3))

# result

array([[  9,  18,  27,  36],
       [ 45,  54,  63,  72],
       [ 81,  90,  99, 108]])
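
The reshape only works when both dimensions are multiples of 3. If they are not, one possible way to handle it (an assumption on my part, since the question does not say what should happen to leftover rows and columns) is to crop the array to whole blocks first:

```python
import numpy as np

block = 3
# Hypothetical input whose shape (10, 13) is not a multiple of the block size.
arr = np.arange(10 * 13).reshape(10, 13)

# Number of complete blocks along each axis.
a = arr.shape[0] // block   # 3
b = arr.shape[1] // block   # 4

# Drop the trailing row and column that do not fill a complete block.
cropped = arr[:a * block, :b * block]

result = cropped.reshape(a, block, b, block).sum(axis=(1, 3))
print(result.shape)  # (3, 4)
```

Padding with zeros instead of cropping would keep the partial blocks; which one is appropriate depends on the data.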

yatu · Answer 4 · 2020-11-03T12:40:40.263

We can use skimage's view_as_blocks to take a strided view of the array and then take the sum of each block:

from skimage.util.shape import view_as_blocks
n = 3
view_as_blocks(arr, (n,n)).sum((-1,2))
# array([[  9,  18,  27,  36],
#        [ 45,  54,  63,  72],
#        [ 81,  90,  99, 108],
#        [117, 126, 135, 144]])
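
For this 12x12 input, `view_as_blocks(arr, (3, 3))` returns a 4-D array of shape (4, 4, 3, 3), where the first two axes index the block and the last two index the position inside it. If skimage is not available, a similar layout can be sketched with plain NumPy. This is only an illustration of the block layout, not a claim about how skimage builds its view:

```python
import numpy as np

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
arr = np.kron(np.arange(1, 17).reshape(4, 4), np.ones((3, 3), dtype=int))

n = 3
rows, cols = arr.shape

# Split each axis into (block index, offset within block), then move the two
# block indices to the front: shape (4, 4, 3, 3).
blocks = arr.reshape(rows // n, n, cols // n, n).swapaxes(1, 2)

print(blocks[1, 2])
# [[7 7 7]
#  [7 7 7]
#  [7 7 7]]

# Summing over the last two axes gives one value per block.
result = blocks.sum(axis=(-1, -2))
```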

edited Nov 03 '20 at 12:40

answered Nov 03 '20 at 12:37

Why divide by `n ** 2`? If you want the average, just use `.mean()`. But OP wants to sum, so no normalization necessary. Also you can do `.sum(-1, -2)` – Nils Werner Nov 03 '20 at 12:39
True, was assuming `mean` for some reason, updated @nils thx – yatu Nov 03 '20 at 12:41
I think you need a tuple to concatenate axes along which to reduce @nils – yatu Nov 03 '20 at 12:41
