Applying a function along a numpy array

Question

I have the following numpy ndarray.

[ -0.54761371  17.04850603   4.86054302]

I want to apply this function to all elements of the array

def sigmoid(x):
  return 1 / (1 + math.exp(-x))

probabilities = np.apply_along_axis(sigmoid, -1, scores)

This is the error that I get.

TypeError: only length-1 arrays can be converted to Python scalars

What am I doing wrong?

Replacing `math.exp` with `np.exp` will solve the issue – kmario23 Mar 26 '17 at 04:22

Serenity · Accepted Answer · 2017-03-26T07:18:41.023

43

numpy.apply_along_axis is not suited to this purpose. Use numpy.vectorize to vectorize your function: https://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html It defines a vectorized function that takes a nested sequence of objects or numpy arrays as inputs and returns a single numpy array or a tuple of numpy arrays as output.

import numpy as np
import math

# custom function
def sigmoid(x):
  return 1 / (1 + math.exp(-x))

# define vectorized sigmoid
sigmoid_v = np.vectorize(sigmoid)

# test
scores = np.array([ -0.54761371,  17.04850603,   4.86054302])
print(sigmoid_v(scores))

Output: [ 0.36641822 0.99999996 0.99231327]

A performance test shows that scipy.special.expit is the fastest way to compute the logistic function and that the np.vectorize variant is the slowest:

import numpy as np
import math
import timeit

def sigmoid_(x):
  return 1 / (1 + math.exp(-x))
sigmoidv = np.vectorize(sigmoid_)

def sigmoid(x):
  # np.exp works element-wise on arrays; note the minus sign in the exponent
  return 1 / (1 + np.exp(-x))

# each call reports the total time in seconds for 25 runs
print(timeit.timeit("sigmoidv(scores)", "from __main__ import sigmoidv, np; scores = np.random.randn(100)", number=25),
      timeit.timeit("sigmoid(scores)", "from __main__ import sigmoid, np; scores = np.random.randn(100)", number=25),
      timeit.timeit("expit(scores)", "from scipy.special import expit; import numpy as np; scores = np.random.randn(100)", number=25))

print(timeit.timeit("sigmoidv(scores)", "from __main__ import sigmoidv, np; scores = np.random.randn(1000)", number=25),
      timeit.timeit("sigmoid(scores)", "from __main__ import sigmoid, np; scores = np.random.randn(1000)", number=25),
      timeit.timeit("expit(scores)", "from scipy.special import expit; import numpy as np; scores = np.random.randn(1000)", number=25))

print(timeit.timeit("sigmoidv(scores)", "from __main__ import sigmoidv, np; scores = np.random.randn(10000)", number=25),
      timeit.timeit("sigmoid(scores)", "from __main__ import sigmoid, np; scores = np.random.randn(10000)", number=25),
      timeit.timeit("expit(scores)", "from scipy.special import expit; import numpy as np; scores = np.random.randn(10000)", number=25))

Results (total seconds for 25 calls):

size       vectorized         numpy               expit
N=100      0.00179314613342   0.000460863113403   0.000132083892822
N=1000     0.0122890472412    0.00084114074707    0.000464916229248
N=10000    0.109477043152     0.00530695915222    0.00424313545227

edited Mar 26 '17 at 07:18

answered Mar 26 '17 at 03:48

Serenity


It is worth noting this: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop." – juanpa.arrivillaga Mar 26 '17 at 04:00
The efficiency of numpy vectorize depends on the size of the array. – Serenity Mar 26 '17 at 04:13
Well sure, but it is basically a python for-loop with extra overhead. – juanpa.arrivillaga Mar 26 '17 at 04:13
I guess this overhead does not matter once the array size becomes significant. E.g.: https://stackoverflow.com/questions/35215161/most-efficient-way-to-map-function-over-numpy-array – Serenity Mar 26 '17 at 04:15
That question doesn't compare using built-in vectorized operations vs `np.vectorize`. – juanpa.arrivillaga Mar 26 '17 at 04:18
OK, but I am talking about mapping a custom function, not just a function that contains only numpy built-in functions. – Serenity Mar 26 '17 at 04:34
And if you want to compare the performance of different versions of the logistic function, you have to try `scipy.special.expit`. I guess it is one of the fastest implementations for Python. – Serenity Mar 26 '17 at 04:41
Sure, but in this case, you should note that using vectorized is going to be an order-of-magnitude slower. Also, in that link, the comparisons are bad. Using `np.fromiter` with a generator expression passing the `count` keyword argument is the fastest, especially for large arrays. – juanpa.arrivillaga Mar 26 '17 at 04:42
Ah, I was unaware of `scipy.special.expit`, that will certainly be faster! – juanpa.arrivillaga Mar 26 '17 at 04:46
`np.vectorize` uses `np.frompyfunc`. `frompyfunc` tends to be 2x faster than explicit iteration, but only returns dtype object arrays. – hpaulj Mar 26 '17 at 05:01
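
The two alternatives mentioned in these comments can be sketched as follows. This is a minimal illustration, not a benchmark; the name `sigmoid_scalar` and the sample `scores` array are assumptions chosen to match the question.

```python
import math
import numpy as np

def sigmoid_scalar(x):
    # scalar-only logistic function, as in the question (hypothetical name)
    return 1 / (1 + math.exp(-x))

scores = np.array([-0.54761371, 17.04850603, 4.86054302])

# np.frompyfunc(func, nin, nout) wraps a Python function as a ufunc-like object;
# the array it returns has dtype object, so cast it if floats are needed
via_frompyfunc = np.frompyfunc(sigmoid_scalar, 1, 1)(scores)
print(via_frompyfunc.dtype)  # object
as_float = via_frompyfunc.astype(float)

# np.fromiter with a generator expression; passing count lets numpy
# allocate the output array once instead of growing it
via_fromiter = np.fromiter(
    (sigmoid_scalar(x) for x in scores), dtype=float, count=scores.size
)
```

Both variants still call the Python function once per element, so neither replaces a truly vectorized `np.exp` expression.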

score 14 · Answer 2 · edited Apr 08 '20 at 14:37

Use np.exp and that will work on numpy arrays in a vectorized fashion:

>>> def sigmoid(x):
...     return 1 / (1 + np.exp(-x))
...
>>> sigmoid(scores)
array([0.36641822, 0.99999996, 0.99231327])
>>>

You will likely not get any faster than this. Consider:

>>> def sigmoid(x):
...     return 1 / (1 + np.exp(-x))
...

And:

>>> def sigmoidv(x):
...   return 1 / (1 + math.exp(-x))
...
>>> vsigmoid = np.vectorize(sigmoidv)

Now, to compare the timings. With a small (size 100) array:

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(100)", number=100)
>>> t
0.006894525984534994
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(100)", number=100)
>>> t
0.0007238480029627681

So there is still an order-of-magnitude difference with small arrays. The gap does not close as the array grows; in these runs it widens to roughly 34x with a 10,000-element array:

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(10000)", number=100)
>>> t
0.3823414359940216
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(10000)", number=100)
>>> t
0.011259705002885312

And finally with a size 100,000 array:

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(100000)", number=100)
>>> t
3.7680041620042175
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(100000)", number=100)
>>> t
0.09544878199812956
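
The timings above only make sense as a comparison if both functions compute the same values. A quick sanity check might look like this; the seeded generator and the array size are arbitrary choices for illustration.

```python
import math
import numpy as np

def sigmoid(x):
    # vectorized: np.exp operates on the whole array at once
    return 1 / (1 + np.exp(-x))

def sigmoidv(x):
    # scalar-only: math.exp accepts a single number
    return 1 / (1 + math.exp(-x))

vsigmoid = np.vectorize(sigmoidv)

# seeded generator so the check is reproducible (seed is an arbitrary choice)
rng = np.random.default_rng(0)
arr = rng.standard_normal(1000)

# both implementations should agree up to floating-point tolerance
agree = np.allclose(sigmoid(arr), vsigmoid(arr))
print(agree)  # True
```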

score 3 · Answer 3 · answered Mar 26 '17 at 04:58

Just to clarify what apply_along_axis is doing, or not doing.

def sigmoid(x):
  print(x)    # show the argument
  return 1 / (1 + math.exp(-x))

In [313]: np.apply_along_axis(sigmoid, -1,np.array([ -0.54761371  ,17.04850603 ,4.86054302])) 
[ -0.54761371  17.04850603   4.86054302]   # the whole array
...
TypeError: only length-1 arrays can be converted to Python scalars

The reason you get the error is that apply_along_axis passes a whole 1d array to your function, namely one slice along the given axis. For your 1d array this is the same as

sigmoid(np.array([ -0.54761371  ,17.04850603 ,4.86054302]))

The apply_along_axis does nothing for you.

As others noted, switching to np.exp allows sigmoid to work with the array (with or without the apply_along_axis wrapper).
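
To see what apply_along_axis does when there is more than one slice, here is a small sketch with a 2d array. The `traced_sigmoid` wrapper and the `scores_2d` array are made up for illustration; the wrapper only records the shape of each argument it receives.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

seen_shapes = []

def traced_sigmoid(row):
    # record what apply_along_axis actually passes in
    seen_shapes.append(row.shape)
    return sigmoid(row)

# hypothetical 2 x 3 input: two rows of three scores each
scores_2d = np.arange(6, dtype=float).reshape(2, 3)

out = np.apply_along_axis(traced_sigmoid, -1, scores_2d)

print(seen_shapes)                            # [(3,), (3,)]
print(np.allclose(out, sigmoid(scores_2d)))   # True
```

The function is called once per 1d slice along the last axis, never once per element, and calling `sigmoid(scores_2d)` directly gives the same result without the wrapper.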

score 3 · Answer 4 · answered Oct 26 '19 at 17:26

scipy already implements this function. Luckily, Python allows us to rename things upon import:

 from scipy.special import expit as sigmoid
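
As a short usage sketch (assuming scipy is installed), the renamed function can be applied straight to the array from the question and compared with the plain numpy expression:

```python
import numpy as np
from scipy.special import expit as sigmoid

scores = np.array([-0.54761371, 17.04850603, 4.86054302])

# expit is a ufunc, so it works element-wise on the whole array
probabilities = sigmoid(scores)

# compare against the hand-written numpy formula
matches_numpy = np.allclose(probabilities, 1 / (1 + np.exp(-scores)))
print(matches_numpy)  # True
```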

answered Oct 26 '19 at 17:26

Sergey V.
