Vectorise sequential numpy calculations

Question

I have a number of operations in numpy which I can perfectly perform in a loop, but I haven't been able to vectorise them in one numpy call.

# data matrix
d = np.random.rand(1496, 1, 2)

# boolean matrix
r = np.random.rand(5, 1496, 1, 2) > 0.5

# result matrix
x = np.empty((5,))

# How can I avoid this loop?
for i in xrange(r.shape[0]):
  x[i] = d[r[i]].sum()

Is it possible to speed things up by somehow vectorising the loop?

AGN Gazer · Answer 1 · 2017-07-10T04:43:52.217

A variation of the @Psidom approach:

np.tensordot(d, r, axes=((0,1,2), (1,2,3)))

This relies on tensordot function. From its documentation:

Given two tensors (arrays of dimension greater than or equal to one), a and b, and an array_like object containing two array_like objects, (a_axes, b_axes), sum the products of a's and b's elements (components) over the axes specified by a_axes and b_axes.

Fundamentally, tensordot computes the product of two arrays (just like @Psidom did) and computes the sum of all elements of the product array. The only "advantage" of this method over @Psidom's approach is that it allows more flexibility in specifying over which axes to perform product and summation in both arrays. It does not offer improved performance over @Psidom's method.

Also, see Understanding tensordot.

Interesting approach. Could you explain it a bit? Would this be faster than @Psidom's suggestion? — orange, Jul 10 '17 at 03:58

Psidom · Accepted Answer · 2017-07-10T02:54:30.247

1

You can vectorize it as this; since d is one dimension less than r, when they are multiplied, d will be broadcasted along axis=0 of r and so avoid the loop; And also since r is a boolean array, d[r].sum() will be the same as (d*r).sum:

(d * r).sum(axis=(1,2,3))
# array([ 775.17049697,  728.61537246,  735.05686655,  765.19469927,
#         759.44834287])

The result is the same as x:

((d*r).sum(axis=(1,2,3)) == x).all()
# True

edited Jul 10 '17 at 02:54

answered Jul 10 '17 at 02:49

Psidom

209,562
33
339
356

Nice idea to multiply the float and boolean matrix and use broadcasting (also thanks for the explanation). – orange Jul 10 '17 at 02:54
No problems. Glad it helps. – Psidom Jul 10 '17 at 02:55

Vectorise sequential numpy calculations

2 Answers2