
I am trying to build samples of m vectors (with integer entries) together with m evaluations. A vector x of shape (n,1) evaluates to y=1 if at least one of its entries is the number 2; otherwise it evaluates to y=0.
To handle many such vectors and evaluations, the sample vectors are stored in an (n,m)-shaped ndarray and the evaluations in a (1,m)-shaped ndarray. See the code:

import numpy as np

n = 10 # number of entries in each sample vector
m = 1000 # number of samples

X = np.random.randint(-10, 10, (n, m))
Y = []
for i in range(m):
    if 2 in X[:, i]:
        Y.append(1)
    else:
        Y.append(0)
Y = np.array(Y).reshape((1,-1))
assert (Y.shape == (1,m))

How can I vectorize the computation of Y? I tried to replace the initialization/computation of X and Y by the following:

X = np.random.randint(-10,10,(n,m))
Y = np.apply_along_axis(func1d=lambda x: 1 if 2 in x else 0, axis=0, arr=X)

A few runs suggested that this is usually even a bit slower than my first approach. (Actually, this answer starts by saying that numpy.apply_along_axis is not meant for speed. I also don't know how well a lambda performs in this context.)

Is there a way to vectorize the computation of Y, i.e. a way to assign a value 1 or 0 to each column, depending on whether that column contains the element 2?

NerdOnTour

1 Answer


Comparisons and logical reductions on NumPy arrays are already heavily optimised, so there is no need to vectorise the task by hand. The following code reaches the same solution:

# assign logical 1 where element == 2 everywhere in the array X,
# then, for each column (axis = 0), if any element == 1 assign column logical 1
Y = (X == 2).any(axis = 0).reshape(1, -1)
print(Y.shape)
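As a sanity check, the one-liner can be compared against the loop from the question. This is a minimal sketch, not part of the original answer: the seeded generator and the names `rng`, `Y_loop`, and `Y_vec` are assumptions introduced here for illustration. It also casts the boolean result to 0/1 integers and uses `keepdims=True` in place of the reshape.

```python
import numpy as np

n = 10    # number of entries in each sample vector
m = 1000  # number of samples

# Seeded generator (an assumption, used only so the check is reproducible)
rng = np.random.default_rng(0)
X = rng.integers(-10, 10, size=(n, m))

# Reference result: the explicit per-column Python loop from the question
Y_loop = np.array([1 if 2 in X[:, i] else 0 for i in range(m)]).reshape(1, -1)

# Vectorized result: elementwise comparison, reduced over the rows (axis=0),
# kept two-dimensional via keepdims, then cast from bool to 0/1 integers
Y_vec = (X == 2).any(axis=0, keepdims=True).astype(int)

assert Y_vec.shape == (1, m)
assert np.array_equal(Y_loop, Y_vec)
```

Without the `.astype(int)` call the result is a boolean array of the same shape, which is often just as usable as a label vector.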

Using timeit to assess execution times:

loop method: 3240 microseconds per run

numpy method: 6.57 microseconds per run
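The timing harness itself is not shown above. One way such a comparison could be run with the standard-library timeit module is sketched below; the function names `loop_method` and `numpy_method` and the run count are hypothetical choices made here, and the absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np

n, m = 10, 1000
X = np.random.randint(-10, 10, (n, m))


def loop_method(X):
    """Per-column Python loop, as in the question (hypothetical wrapper)."""
    Y = []
    for i in range(X.shape[1]):
        Y.append(1 if 2 in X[:, i] else 0)
    return np.array(Y).reshape((1, -1))


def numpy_method(X):
    """Whole-array comparison followed by a reduction over the rows."""
    return (X == 2).any(axis=0).reshape(1, -1)


runs = 200  # arbitrary number of repetitions, chosen for this sketch
t_loop = timeit.timeit(lambda: loop_method(X), number=runs) / runs
t_numpy = timeit.timeit(lambda: numpy_method(X), number=runs) / runs

print(f"loop method:  {t_loop * 1e6:.1f} microseconds per run")
print(f"numpy method: {t_numpy * 1e6:.1f} microseconds per run")
```

Dividing the total by the number of runs gives an average per call, which is the quantity the figures above report.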

If you're interested, you could check whether other vectorisation methods, such as np.vectorize, improve the time further, though I'm fairly sure the underlying NumPy optimisations already perform their own vectorisation at the CPU instruction level (SIMD) by default.

The bottom line: when using NumPy, always try to find a solution built from logical arrays and NumPy functions/methods, as these are already heavily optimised within the compiled binaries, whereas any Python function used to manipulate, access, or iterate over the data slows execution dramatically.

By the way, the most common way to speed up a for loop that builds a list of outputs, as you've done, is to use a list comprehension:

Y = np.array([2 in X[:, i] for i in range(m)]).reshape((1, -1))

which executes in 3070 microseconds per loop.

G.S
  • Most numpy functions provide an `axis` argument to apply an operation along a specific axis (or axes). Since the operation here is something trivial, like checking for an equality + a `np.any` call, it's best to leave it to numpy itself, like this answer shows. Mind you, this result is a boolean array, but you can easily convert to your desired format with `.astype(int)`. `np.vectorize` is [effectively a for loop](https://stackoverflow.com/questions/40955550/numpy-vectorize-why-so-slow) and should be used when the operation you want is so complex that numpy doesn't provide it by default. – Reti43 Nov 25 '21 at 13:41
  • FYI: `(X == 2).any(axis = 0).reshape(1, -1)` can also be written `(X == 2).any(axis=0, keepdims=True)` – Warren Weckesser Nov 25 '21 at 13:46
  • `numpy` "vectorization" has little to do with SIMD. The big difference is between performing loops in interpreted Python, and moving them to compiled `numpy` methods. Your `(X == 2).any(axis = 0)` is doing (2) whole-array operations without python level loops. `numpy` doesn't provide any functions that can compile user provided functions. – hpaulj Nov 25 '21 at 16:23
  • Thanks for the clarification on the expected performance and behaviour of np.vectorize – G.S Nov 25 '21 at 16:52