Difference between frompyfunc and vectorize in numpy

Question

What is the difference between vectorize and frompyfunc in numpy?

Both seem very similar. What is a typical use case for each of them?

Edit: As JoshAdel indicates, the class vectorize seems to be built upon frompyfunc. (see the source). It is still unclear to me whether frompyfunc may have any use case that is not covered by vectorize...

Any numpy developers out there who can clear this up? Numpy has many situations like this, where there are higher- and lower-level implementations with no pointer between them in the docs. — dtlussier, Nov 24 '11 at 18:07
For some secret reason, `frompyfunc` produces functions that deliberately disregard the `dtype` argument and return an array of `object`s. As the documentation explains, "The returned ufunc always returns PyObject arrays". There is an easy and ingenious workaround: pass an array of the desired type as the `out` argument. The `vectorize` function, by contrast, lets you specify the output type of the ufunc with the `otypes` argument, but it is said to be slow and hence fairly useless compared to using nested lists. — Alexey, Mar 14 '16 at 16:32
If anyone wants to take the speed arguments further, there's [this](https://stackoverflow.com/questions/57253839/why-vectorize-is-outperformed-by-frompyfunc/74596788#74596788) Q/A, that offers 1000x speedup. — Neil_UK, Nov 28 '22 at 06:39

Stuart Berg · Answer 1 · 2013-08-21T03:12:13.490

As JoshAdel points out, vectorize wraps frompyfunc. Vectorize adds extra features:

Copies the docstring from the original function.
Allows you to exclude an argument from broadcasting rules.
Returns an array of the correct dtype instead of dtype=object.
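
The three features listed above can be sketched together in one small example. The function `polyval` and its arguments are made up for illustration; `excluded` and `otypes` are the `numpy.vectorize` parameters that control broadcasting exclusion and output dtype.

```python
import numpy as np

def polyval(coeffs, x):
    """Evaluate a polynomial with the given coefficients at scalar x."""
    result = 0
    for c in coeffs:
        result = result * x + c  # Horner's scheme
    return result

# Exclude positional argument 0 (coeffs) from broadcasting, so the whole
# list is passed unchanged on every call; request float output explicitly.
vpolyval = np.vectorize(polyval, excluded=[0], otypes=[float])

values = vpolyval([1, 2, 3], np.array([0, 1, 2]))
print(values, values.dtype)
print(vpolyval.__doc__)  # docstring copied from polyval
```

Without `excluded`, `coeffs` would be broadcast against `x` element by element, and `polyval` would receive a single coefficient instead of the full list.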

Edit: After some brief benchmarking, I find that vectorize is significantly slower (~50%) than frompyfunc for large arrays. If performance is critical in your application, benchmark your use-case first.
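
A minimal way to run such a benchmark yourself, assuming the same `f` as in the example in this answer. The array size and repeat count are arbitrary choices, and the timings will vary with machine and NumPy version, so no expected numbers are given.

```python
import timeit
import numpy as np

def f(x, y):
    return 2 * x + y

f_vectorize = np.vectorize(f)
f_frompyfunc = np.frompyfunc(f, 2, 1)

a = np.arange(100_000)  # arbitrary "large" input for illustration

# Time a handful of full passes over the array with each wrapper.
t_vec = timeit.timeit(lambda: f_vectorize(a, 2), number=5)
t_fpf = timeit.timeit(lambda: f_frompyfunc(a, 2), number=5)

print(f"vectorize:  {t_vec:.3f} s")
print(f"frompyfunc: {t_fpf:.3f} s")
```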

```
>>> import numpy
>>> a = numpy.indices((3,3)).sum(0)

>>> print(a, a.dtype)  # the integer dtype is platform-dependent
[[0 1 2]
 [1 2 3]
 [2 3 4]] int32

>>> def f(x,y):
...     """Returns 2 times x plus y"""
...     return 2*x+y
...
>>> f_vectorize = numpy.vectorize(f)

>>> f_frompyfunc = numpy.frompyfunc(f, 2, 1)

>>> f_vectorize.__doc__
'Returns 2 times x plus y'

>>> f_frompyfunc.__doc__
'f (vectorized)(x1, x2[, out])\n\ndynamic ufunc based on a python function'

>>> f_vectorize(a,2)
array([[ 2,  4,  6],
       [ 4,  6,  8],
       [ 6,  8, 10]])

>>> f_frompyfunc(a,2)
array([[2, 4, 6],
       [4, 6, 8],
       [6, 8, 10]], dtype=object)
```
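
If the `dtype=object` result from `frompyfunc` is a problem, one straightforward option is to cast it afterwards with `astype`. The sketch below compares that with asking `vectorize` for a specific type up front through `otypes`; it reuses `f` and `a` from the session above.

```python
import numpy as np

def f(x, y):
    return 2 * x + y

a = np.indices((3, 3)).sum(0)

# frompyfunc always produces an object array ...
obj_result = np.frompyfunc(f, 2, 1)(a, 2)
# ... which can be cast explicitly to a numeric dtype afterwards.
converted = obj_result.astype(a.dtype)

# vectorize can be told the output dtype directly.
typed = np.vectorize(f, otypes=[float])(a, 2)

print(obj_result.dtype, converted.dtype, typed.dtype)
```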

Interesting... but the differences and use cases are still pretty much unclear to me... — Olivier Verdier, Jul 02 '12 at 18:39

JoshAdel · Answer 2 · 2011-07-20T21:24:30.690

I'm not sure what the different use cases for each are, but if you look at the source code (/numpy/lib/function_base.py), you'll see that vectorize wraps frompyfunc. My reading of the code is that vectorize mostly adds proper handling of the input arguments. There might be particular instances where you would prefer one over the other, but it would seem that frompyfunc is just a lower-level building block of vectorize.
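
To make the "wraps" relationship concrete, here is a deliberately simplified sketch of the idea. It is not NumPy's actual source: `simple_vectorize` is a hypothetical helper that handles only a single output and requires the caller to supply the dtype.

```python
import numpy as np

def simple_vectorize(pyfunc, nin, otype):
    """Hypothetical, simplified stand-in for np.vectorize (single output only)."""
    ufunc = np.frompyfunc(pyfunc, nin, 1)  # ufunc that returns object arrays

    def wrapper(*args):
        # Broadcast and call pyfunc element-wise, then cast to the requested dtype.
        return np.asarray(ufunc(*args), dtype=otype)

    wrapper.__doc__ = pyfunc.__doc__  # carry the docstring over
    return wrapper

def f(x, y):
    """Returns 2 times x plus y"""
    return 2 * x + y

g = simple_vectorize(f, 2, np.int64)
result = g(np.arange(3), 2)
print(result, result.dtype)
```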

edited Jul 20 '11 at 21:24

answered Jul 20 '11 at 21:19


I agree with you that `frompyfunc` seems to be lower level than `vectorize`. The question remains, though, of whether there are cases where you would prefer to use `frompyfunc` instead of `vectorize`? – Olivier Verdier Jul 21 '11 at 05:37
I think you cannot use `.accumulate` in `vectorize` like https://stackoverflow.com/a/27912352/3226167 – user3226167 Jun 19 '18 at 10:05
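
A small sketch of that difference: the object returned by `frompyfunc` is a real ufunc and so has methods such as `.accumulate` and `.reduce`, whereas the object returned by `vectorize` does not. The function `f` is just an illustrative binary function, and `dtype=object` is passed explicitly because the generated ufunc works on Python objects.

```python
import numpy as np

def f(x, y):
    return 2 * x + y

f_ufunc = np.frompyfunc(f, 2, 1)
a = np.arange(1, 5)  # [1, 2, 3, 4]

# Running application of f: r[0] = a[0], r[i] = f(r[i-1], a[i])
running = f_ufunc.accumulate(a, dtype=object)
total = f_ufunc.reduce(a, dtype=object)

# A vectorize object is not a ufunc and has no such method.
has_accumulate = hasattr(np.vectorize(f), "accumulate")

print(running, total, has_accumulate)
```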

utada219 · Answer 3 · score 6 · answered Dec 17 '16 at 10:25

Although both functions give you a way to build your own ufunc, a ufunc created with numpy.frompyfunc always returns Python objects (an array with dtype=object), whereas numpy.vectorize lets you specify the return type through its otypes argument.
