Efficiently using Numpy to assign function values to array

Question

I am interested in finding the fastest way of carrying out a simple operation in Python 3.6 using NumPy. I want to define a function and, from a given array, build an array of the function's values. Here is simplified code that does this using map:

import numpy as np
def func(x):
    return x**2
xRange = np.arange(0,1,0.01)
arr_func = np.array(list(map(func, xRange)))

However, as I am running it with a complicated function and using large arrays, runtime speed is very important for me. Is there a known faster way?

EDIT: My question is not the same as this one, because I am asking about assigning from a function, as opposed to a generator.

The actual implementation would involve specific optimizations. So, without seeing it, there's no magical way for generic cases. — Divakar, Aug 16 '17 at 08:10
Thank you @Divakar. I am indeed looking for a faster way to deal with generic cases. — splinter, Aug 16 '17 at 08:12
Why xRange and pRange? In this particular case, the **2 operation is already vectorized, so you're incurring a penalty by doing the map instead of just doing `arr_func = func(xRange)`. In general, you have to exploit vectorized operations as much as you can. — Ignacio Vergara Kausel, Aug 16 '17 at 08:14
Thanks @IgnacioVergaraKausel, the `pRange` was an error in pasting. I removed it. — splinter, Aug 16 '17 at 08:17
Just to add, if you just do `func(xRange)` I get 44.8 microseconds, while your map to list to array takes 33.4 milliseconds (for an array of 100000 random elements). — Ignacio Vergara Kausel, Aug 16 '17 at 08:18
Possible duplicate of [How do I build a numpy array from a generator?](https://stackoverflow.com/questions/367565/how-do-i-build-a-numpy-array-from-a-generator) — orip, Aug 16 '17 at 08:21
`fromiter` and `frompyfunc` are possible alternatives. You may have to do some timings of your own. The time spent running your function might dominate any iteration and collection mechanism. — hpaulj, Aug 16 '17 at 12:04

score 2 · Accepted Answer · edited Aug 16 '17 at 12:18

Check the related question "How do I build a numpy array from a generator?", where the most compelling option seems to be preallocating the numpy array and setting values, instead of creating a throwaway intermediate list.

arr_func = np.empty(len(xRange))
for i in range(len(xRange)):
  arr_func[i] = func(xRange[i])

edited Aug 16 '17 at 12:18

splinter

answered Aug 16 '17 at 08:23

orip

hpaulj · Answer 2 · 2017-08-16T17:03:58.987

With a complex function that can't be rewritten with compiled numpy functions, we can't make big improvements in speed.

Define a function with math methods that require scalars, for example:

import math
def func(x):
    return math.sin(x)**2 + math.cos(x)**2

In [868]: x = np.linspace(0,np.pi,10000)

For reference, do a straightforward list comprehension:

In [869]: np.array([func(i) for i in x])
Out[869]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])

In [870]: timeit np.array([func(i) for i in x])
13.4 ms ± 211 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Your list map is slightly faster:

In [871]: timeit np.array(list(map(func, x)))
12.6 ms ± 12.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

For a 1d array like this, np.array can be replaced with np.fromiter. It works with a generator as well, including the Py3 map.

In [875]: timeit np.fromiter(map(func, x),float)
13.1 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So that could get around the possible time penalty of creating a whole list first. But in this case it doesn't help.
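As a minimal sketch of the np.fromiter variant: it also accepts an optional count argument, which lets it allocate the output once instead of growing it while it consumes the iterator. This reuses func and x from above; whether it changes the timings for a given function is something to measure, since the per-element Python call is likely to dominate.

```python
import math
import numpy as np

def func(x):
    # Scalar-only function, same as in the answer above
    return math.sin(x)**2 + math.cos(x)**2

x = np.linspace(0, np.pi, 10000)

# count tells fromiter how many items to expect, so the result array
# can be allocated up front rather than resized on demand
arr = np.fromiter(map(func, x), dtype=float, count=len(x))
```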

Another iterator is np.frompyfunc. It is used by np.vectorize, but usually is faster with less overhead. It returns an object dtype array:

In [876]: f = np.frompyfunc(func, 1, 1)
In [877]: f(x)
Out[877]: array([1.0, 1.0, 1.0, ..., 1.0, 1.0, 1.0], dtype=object)
In [878]: timeit f(x)
11.1 ms ± 298 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [879]: timeit f(x).astype(float)
11.2 ms ± 85.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

A slight speed improvement. I noticed more of an improvement with a 1000-item x. This is even better if your problem requires several arrays that may be broadcast against each other.
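To illustrate the broadcasting case, here is a small sketch with a two-input function. The name func2 and its formula are made up for this example and are not from the question; the point is only that np.frompyfunc broadcasts its inputs the way a regular ufunc does.

```python
import math
import numpy as np

def func2(a, b):
    # Hypothetical two-argument scalar function, for illustration only
    return math.sin(a) * math.cos(b)

f2 = np.frompyfunc(func2, 2, 1)   # 2 inputs, 1 output

a = np.linspace(0, np.pi, 4)[:, None]   # shape (4, 1)
b = np.linspace(0, np.pi, 3)            # shape (3,)

# The inputs broadcast against each other to shape (4, 3);
# the result is object dtype, so convert it to float
out = f2(a, b).astype(float)
```

The same result with plain loops would need two nested iterations and explicit index bookkeeping.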

Assigning to a preallocated out array may save memory, and is often recommended as an alternative to the list append iteration. But here it doesn't give a speed improvement:

In [882]: %%timeit 
     ...: out = np.empty_like(x)
     ...: for i,j in enumerate(x): out[i]=func(j)
16.1 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

(The use of enumerate is slightly faster than range iteration.)
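All of the variants above still call a scalar Python function once per element. For contrast, here is a sketch of the case where the function can be rewritten with compiled NumPy functions; func_np is a name chosen for this example. It swaps math.sin and math.cos for np.sin and np.cos, so the whole array is processed in one call with no Python-level loop.

```python
import numpy as np

def func_np(x):
    # Same formula as func, but np.sin and np.cos are ufuncs that
    # operate on the whole array at once
    return np.sin(x)**2 + np.cos(x)**2

x = np.linspace(0, np.pi, 10000)
arr = func_np(x)   # no map, list, or explicit loop needed
```

This only applies when every operation in the function has an array-aware equivalent, which is the situation the comments on the question describe for x**2.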
