Your function and arrays:
In [222]: def my_func(a: np.ndarray, b: np.ndarray) -> float:
...: return np.nanmin(a, axis=0) + np.nanmin(b, axis=0)
...:
In [223]: a = np.array([np.array([1., 2]), np.array([1, 2., 3, np.nan])], dtype=object
...: ) # First array shape (2,), second (3,)
...: b = np.array([np.array([1]), np.array([1.5, 2.5, np.nan])], dtype=object)
In [224]: a
Out[224]: array([array([1., 2.]), array([ 1., 2., 3., nan])], dtype=object)
In [225]: b
Out[225]: array([array([1]), array([1.5, 2.5, nan])], dtype=object)
Compare vectorize
with a straightforward list comprehension:
In [226]: np.vectorize(my_func)(a, b)
Out[226]: array([2. , 2.5])
In [227]: [my_func(i,j) for i,j in zip(a,b)]
Out[227]: [2.0, 2.5]
and their times:
In [228]: timeit np.vectorize(my_func)(a, b)
157 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [229]: timeit [my_func(i,j) for i,j in zip(a,b)]
85.9 µs ± 148 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [230]: timeit np.array([my_func(i,j) for i,j in zip(a,b)])
89.7 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
If you are going to work with object arrays, frompyfunc
is faster than vectorize
:
In [231]: np.frompyfunc(my_func,2,1)(a, b)
Out[231]: array([2.0, 2.5], dtype=object)
In [232]: timeit np.frompyfunc(my_func,2,1)(a, b)
83.2 µs ± 50.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I'm a bit surprised that it's even better than the list comprehension.
frompyfunc
(and vectorize
) are more useful when the inputs need to 'broadcast' against each other:
In [233]: np.frompyfunc(my_func,2,1)(a[:,None], b)
Out[233]:
array([[2.0, 2.5],
[2.0, 2.5]], dtype=object)
I'm not a numba
expert, but I suspect it doesn't handle object dtype arrays, or it it does it doesn't improve speed much. Remember, object dtype means the elements are object references, just like in lists.
I get better times by using otypes
and taking the function creation out of the timing loop:
In [235]: %%timeit f=np.vectorize(my_func, otypes=[float])
...: f(a, b)
...:
...:
95.5 µs ± 316 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [236]: %%timeit f=np.frompyfunc(my_func,2,1)
...: f(a, b)
...:
...:
81.1 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
If you don't know about otypes
, you haven't read the np.vectorize
docs well enough.