I am trying to build samples of m vectors (with integer entries) together with m evaluations. A vector x of shape (n,1) is evaluated to y=1 if one of its entries is the number 2. Otherwise, it is evaluated to y=0.
In order to deal with many such vectors and evaluations, the sample vectors are stored in an (n,m)-shaped ndarray and the evaluations are stored in a (1,m)-shaped ndarray. See the code:
import numpy as np
n = 10 # number of entries in each sample vector
m = 1000 # number of samples
X = np.random.randint(-10, 10, (n, m))
Y = []
for i in range(m):
    if 2 in X[:, i]:
        Y.append(1)
    else:
        Y.append(0)
Y = np.array(Y).reshape((1,-1))
assert (Y.shape == (1,m))
How can I vectorize the computation of Y? I tried to replace the initialization/computation of X and Y by the following:
X = np.random.randint(-10,10,(n,m))
Y = np.apply_along_axis(func1d=lambda x: 1 if 2 in x else 0, axis=0, arr=X)
A few runs suggested that this is usually even a bit slower than my first approach. (Actually, this answer starts by saying that numpy.apply_along_axis is not meant for speed. I also do not know how well lambda performs in this context.)
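To make the "a few runs" comparison reproducible, the two approaches can be timed with the standard-library timeit module. This is only a rough sketch: the helper names loop_version and apply_version and the repeat count of 20 are assumptions made for illustration, and the measured times will vary by machine and NumPy version.

```python
import timeit

import numpy as np

n = 10    # number of entries in each sample vector
m = 1000  # number of samples
X = np.random.randint(-10, 10, (n, m))


def loop_version(X):
    # Hypothetical helper: explicit Python loop over the columns,
    # mirroring the first approach.
    Y = []
    for i in range(X.shape[1]):
        if 2 in X[:, i]:
            Y.append(1)
        else:
            Y.append(0)
    return np.array(Y).reshape((1, -1))


def apply_version(X):
    # Hypothetical helper: the apply_along_axis attempt, reshaped to (1, m).
    Y = np.apply_along_axis(func1d=lambda x: 1 if 2 in x else 0, axis=0, arr=X)
    return Y.reshape((1, -1))


# Total wall-clock time in seconds for 20 calls of each version.
t_loop = timeit.timeit(lambda: loop_version(X), number=20)
t_apply = timeit.timeit(lambda: apply_version(X), number=20)
print(f"loop: {t_loop:.4f} s, apply_along_axis: {t_apply:.4f} s")
```

Both helpers call a Python-level function once per column, which is consistent with the two timings being in the same range.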
Is there a way to vectorize the computation of Y, i.e. a way to assign a value 1 or 0 to each column, depending on whether that column contains the element 2?
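One possible way to express this column-wise check without a Python loop is an elementwise comparison followed by a reduction along axis 0. The following is a minimal sketch, assuming the same n, m and X as above; it is one option among several and not necessarily the fastest.

```python
import numpy as np

n = 10    # number of entries in each sample vector
m = 1000  # number of samples
X = np.random.randint(-10, 10, (n, m))

# X == 2 is a boolean array of shape (n, m).
# any(axis=0) reduces over the rows, so each column yields True if it
# contains at least one 2; keepdims=True keeps the result at shape (1, m).
Y = (X == 2).any(axis=0, keepdims=True).astype(int)

assert Y.shape == (1, m)
```

The comparison and the reduction both run inside NumPy, so no Python function is called per column.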