5

I have data array, with shape 100x100. I want to divide it into 5x5 blocks, and each block has 20x20 grids. The value of each block I want is the sum of all values in it.

Is there a more elegant way to accomplish it?

x = np.arange(100)
y = np.arange(100)
X, Y = np.meshgrid(x, y)
Z = np.cos(X)*np.sin(Y)
Z_new = np.zeros((5, 5))
for i in range(5):
  for j in range(5):
    Z_new[i, j] = np.sum(Z[i*20:20+i*20, j*20:20+j*20])

This is based on index, how if based on x?

x = np.linspace(0, 1, 100)
y = np.linspace(0, 1, 100)
X, Y = np.meshgrid(x, y)
Z = np.cos(X)*np.sin(Y)
x_new = np.linspace(0, 1, 15)
y_new = np.linspace(0, 1, 15)

Z_new?

Aristotle0
  • 317
  • 2
  • 9
  • For solutions in any number of dimensions: http://stackoverflow.com/questions/36269508/lets-make-a-reference-implementation-of-n-dimensional-pixel-binning-bucketing-f/36269734#36269734 – Alex Riley Apr 03 '16 at 08:29
  • @ajcr I am reopening this because the duplicate linked question would involve considerable overhead of setting up for a generic ndarray case, which isn't needed to solve a case like this. Hope this sounds okay. – Divakar Apr 29 '16 at 14:04
  • Possible duplicate of [Blockwise operations in Numpy](https://stackoverflow.com/questions/34325176/blockwise-operations-in-numpy) – norok2 Oct 22 '19 at 07:20

3 Answers3

5

Simply reshape splitting each of those two axes into two each with shape (5,20) to form a 4D array and then sum reduce along the axes having the lengths 20, like so -

Z_new = Z.reshape(5,20,5,20).sum(axis=(1,3))

Functionally the same, but potentially faster option with np.einsum -

Z_new = np.einsum('ijkl->ik',Z.reshape(5,20,5,20))

Generic block size

Extending to a generic case -

H,W = 5,5 # block-size
m,n = Z.shape
Z_new = Z.reshape(H,m//H,W,n//W).sum(axis=(1,3))

With einsum that becomes -

Z_new = np.einsum('ijkl->ik',Z.reshape(H,m//H,W,n//W))

To compute average/mean across blocks, use mean instead of sum method.

Generic block size and reduction operation

Extending to use reduction operations that have ufuncs supporting multiple axes parameter with axis for reductions, it would be -

def blockwise_reduction(a, height, width, reduction_func=np.sum):
    m,n = a.shape
    a4D = a.reshape(height,m//height,width,n//width)
    return reduction_func(a4D,axis=(1,3))

Thus, to solve our specific case, it would be :

blockwise_reduction(Z, height=5, width=5)

and for a block-wise average computation, it would be -

blockwise_reduction(Z, height=5, width=5, reduction_func=np.mean)
Divakar
  • 218,885
  • 19
  • 262
  • 358
0

You can do following.

t = np.eye(5).repeat(20, axis=1)
Z_new = t.dot(Z).dot(t.T)

This is correct because Z_new[i, j] = t[i, k] * Z[k, l] * t[j, l]

Also this seems faster than Divakar's solution.

rnbguy
  • 1,369
  • 1
  • 10
  • 28
0

Such a problem is a very good candidate for a function like scipy.ndimage.measurements.sum since it allows "grouping" and "labelling" terms. You will have what you want with something like:

labels = [[20*(y//5) + x//5 for x in range(100)] for y in range(100)]
s = scipy.ndimage.measurements.sum(Z, labels, range(400))

(Not tested, but that is the idea).

Thomas Baruchel
  • 7,236
  • 2
  • 27
  • 46