I have a 512x512 image array and I want to perform operations on 8x8 blocks. At the moment I have something like this:
import numpy as np

output = np.zeros((512, 512))
for i in range(0, 512, 8):
    for j in range(0, 512, 8):
        # matching 8x8 blocks from each array
        a = input[i:i+8, j:j+8]
        b = some_other_array[i:i+8, j:j+8]
        output[i:i+8, j:j+8] = np.dot(a, b)
where a and b are 8x8 blocks taken from the original arrays. I would like to speed up this code by using vectorised operations. I have reshaped my inputs like this:
input = input.reshape(64, 8, 64, 8)
some_other_array = some_other_array.reshape(64, 8, 64, 8)
How could I perform a dot product on only axes 1 and 3 to output an array of shape (64, 8, 64, 8)?
I have tried np.tensordot(input, some_other_array, axes=([0, 1], [2, 3])), which gives the correct output shape, but the values do not match the output from the loop above. I've also looked at np.einsum, but I haven't come across a simple example of what I'm trying to achieve.
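
For reference, here is a minimal sketch of what an np.einsum version might look like. The tensordot call above sums over the block-index axes as well, whereas einsum can keep them as batch axes and sum only over the inner dimension of each block. The function name blockwise_matmul and its block parameter are hypothetical, introduced only for illustration, and the sketch assumes both arrays are 2-D, have the same shape, and have dimensions divisible by the block size.

```python
import numpy as np


def blockwise_matmul(x, y, block=8):
    """Matrix-multiply corresponding block x block tiles of x and y.

    Hypothetical helper: assumes x and y are 2-D, equally shaped, and
    that both dimensions are multiples of `block`.
    """
    rows, cols = x.shape
    # Split each axis into (block index, offset inside the block).
    x4 = x.reshape(rows // block, block, cols // block, block)
    y4 = y.reshape(rows // block, block, cols // block, block)
    # a, b: block row and block column (kept as batch axes)
    # i, j: row and column inside the output block
    # k:    inner dimension, summed over within each block
    out4 = np.einsum('aibk,akbj->aibj', x4, y4)
    # Merge the split axes back into a 2-D image.
    return out4.reshape(rows, cols)


rng = np.random.default_rng(0)
image = rng.standard_normal((512, 512))
other = rng.standard_normal((512, 512))
result = blockwise_matmul(image, other)
```

The subscripts follow the reshaped layout (block row, row in block, block column, column in block), so the intermediate result has shape (64, 8, 64, 8) before the final reshape back to (512, 512).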