I'm not sure this applies in your case, but vectorize has a few tricks.
If you do not specify the return dtype, it determines it with a test calculation, using your first case. If your function returns a scalar integer there, like 0, then vectorize returns an integer array. So if you expect floats, make sure you specify the return dtype.
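A minimal sketch of that dtype pitfall. The function hinge and the sample values are made up for illustration; they are not from your code:

```python
import numpy as np

def hinge(x):
    # Hypothetical function: returns the integer 0 for negative inputs,
    # and a float otherwise.
    return 0 if x < 0 else x * 0.5

x = np.array([-1.0, 1.0, 3.0])

# No otypes: the dtype is taken from the trial call on the first element,
# which returns the int 0, so the whole result becomes an integer array.
inferred = np.vectorize(hinge)(x)

# With otypes the output dtype is fixed up front.
explicit = np.vectorize(hinge, otypes=[float])(x)

print(inferred.dtype.kind, inferred.tolist())   # i [0, 0, 1]
print(explicit.dtype.kind, explicit.tolist())   # f [0.0, 0.5, 1.5]
```

The fractional parts are silently truncated in the first result.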
Also, vectorize is not a speed tool. It's just a convenient way of applying broadcasting to your inputs. It isn't much faster than explicitly looping on your inputs.
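To illustrate what "explicitly looping" means here, a rough sketch comparing vectorize with a plain Python loop over the broadcast inputs. The two-argument fun and the data are assumptions for the example only:

```python
import numpy as np

def fun(x, pivot):
    # Hypothetical scalar function, called once per element pair.
    return max(0.0, x - pivot)

x = np.linspace(0, 1, 5)
pivots = np.array([0.2, 0.5, 0.8])

# vectorize broadcasts (5,1) against (3,) and calls fun 15 times.
via_vectorize = np.vectorize(fun, otypes=[float])(x[:, None], pivots)

# The same thing spelled out: iterate over the broadcast pairs in Python.
b = np.broadcast(x[:, None], pivots)
via_loop = np.array([fun(xi, pi) for xi, pi in b], dtype=float).reshape(b.shape)

print(via_vectorize.shape)                      # (5, 3)
print(np.array_equal(via_vectorize, via_loop))  # True
```

Both versions call the Python function once per element, which is why neither is fast.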
np.vectorize(fun, otypes=[float]) removes the steps, since the results are no longer truncated to integers.
===========
Try this:
vfun = np.vectorize(fun, otypes=[float])
X = vfun(interval[:,None], pivots, truths)
print(X.shape) # (300,3)
y2 = np.dot(X, coeffs)
print(y2.shape) # (300,)
It makes fuller use of vectorize's broadcasting.
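Here is a self-contained version of that snippet. I'm guessing at the scalar form of fun, and the values of interval, pivots, truths and coeffs are placeholders, not your data:

```python
import numpy as np

def fun(x, pivot, truth):
    # Assumed scalar version of the function; works on one value at a time.
    if truth:
        return max(0, x - pivot)
    return max(0, pivot - x)

# Placeholder inputs, chosen only to make the shapes visible.
interval = np.linspace(0, 10, 300)
pivots = np.array([2.0, 5.0, 8.0])
truths = np.array([True, False, True])
coeffs = np.array([1.0, -0.5, 2.0])

vfun = np.vectorize(fun, otypes=[float])

# (300,1) broadcast against (3,) and (3,) gives (300,3).
X = vfun(interval[:, None], pivots, truths)
y2 = np.dot(X, coeffs)

print(X.shape)   # (300, 3)
print(y2.shape)  # (300,)
```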
I suspect your fun can be written so as to act on the whole x, without the iteration that vectorize does.
Changing fun to use np.maximum allows me to supply an array x:
def fun(x, pivot, truth):
    if truth: return np.maximum(0, x - pivot)
    else: return np.maximum(0, pivot - x)
And I can then calculate X with only a loop over the 3 cases of pivots and truths, calculating all interval values at once:
X = np.stack([fun(interval, p, t) for p, t in zip(pivots, truths)], axis=-1)
y2 = np.dot(X, coeffs)
Another way of applying the 3 'cases':
Xlist = [fun(interval, p, t)*c for p, t, c in zip(pivots, truths, coeffs)]
y2 = np.sum(Xlist, axis=0)
This gives the same result because the np.dot(..., coeffs) is just a weighted sum. I'm not sure it's better.
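A quick check that the two array-based versions agree, again with made-up values for interval, pivots, truths and coeffs:

```python
import numpy as np

def fun(x, pivot, truth):
    # Array-friendly version: np.maximum works elementwise on the whole x.
    if truth:
        return np.maximum(0, x - pivot)
    return np.maximum(0, pivot - x)

# Placeholder inputs for illustration only.
interval = np.linspace(0, 10, 300)
pivots = np.array([2.0, 5.0, 8.0])
truths = np.array([True, False, True])
coeffs = np.array([1.0, -0.5, 2.0])

# Version 1: stack the 3 columns, then take the dot product.
X = np.stack([fun(interval, p, t) for p, t in zip(pivots, truths)], axis=-1)
y_dot = np.dot(X, coeffs)

# Version 2: scale each column by its coefficient and sum.
Xlist = [fun(interval, p, t) * c for p, t, c in zip(pivots, truths, coeffs)]
y_sum = np.sum(Xlist, axis=0)

print(X.shape, y_dot.shape, y_sum.shape)  # (300, 3) (300,) (300,)
print(np.allclose(y_dot, y_sum))          # True
```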