
Say I have a function foo() that takes in a single float and returns a single float. What's the fastest/most pythonic way to apply this function to every element in a numpy matrix or array?

What I essentially need is a version of this code that doesn't use a loop:

import numpy as np

big_matrix = np.matrix(np.ones((1000, 1000)))

for i in range(np.shape(big_matrix)[0]):
    for j in range(np.shape(big_matrix)[1]):
        big_matrix[i, j] = foo(big_matrix[i, j])

I was trying to find something in the numpy documentation that will allow me to do this but I haven't found anything.

Edit: As I mentioned in the comments, specifically the function I need to work with is the sigmoid function, f(z) = 1 / (1 + exp(-z)).

ClydeTheGhost
  • Agree that vectorization is the answer. You will want to re-think what function foo() does, currently it works on individual elements. Vectorization means operating on entire rows/columns at once thus removing the loop. – chill_turner Jul 26 '16 at 18:50
  • `np.vectorize` is definitely the most "pythonic" in the general case. However, for certain functions `foo`, you might be able to do better by not using a function at all and relying on `numpy` vector operations (since `np.vectorize` doesn't really do anything to make the calculation more performant). – mgilson Jul 26 '16 at 18:51
  • I'm specifically looking to use the sigmoid function, i.e. `f(z) = 1 / (1 + exp(-z))` – ClydeTheGhost Jul 26 '16 at 19:12
  • replace `exp(-z)` with `np.exp(-z)` and your function will be vectorized. Then it can take an np.array as input and return the correct answer. – Dr K Jul 26 '16 at 19:22
  • http://stackoverflow.com/questions/7701429/efficient-evaluation-of-a-function-at-every-cell-of-a-numpy-array is not a good duplicate. All it proposes is `np.vectorize` which loops. There are better answers in the comments. – hpaulj Jul 26 '16 at 20:52

1 Answer


If foo is really a black box that takes a scalar and returns a scalar, then you must use some sort of iteration. People often try np.vectorize and find that, as documented, it does not speed things up much; it is most valuable as a way of broadcasting several inputs. np.vectorize uses np.frompyfunc internally, which is slightly faster when called directly but has a less convenient interface.
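For the black-box case, a minimal sketch of both wrappers might look like the following. The body of `foo` here is only a stand-in (a scalar sigmoid built on `math.exp`), and the array size is arbitrary; neither wrapper removes the Python-level call per element.

```python
import math
import numpy as np

def foo(x):
    # Hypothetical scalar-only function standing in for the black box.
    return 1.0 / (1.0 + math.exp(-x))

arr = np.ones((3, 4))

# np.vectorize: convenient, handles broadcasting, but still calls foo
# once per element in Python.
vec_foo = np.vectorize(foo)
out_vectorize = vec_foo(arr)

# np.frompyfunc(func, nin, nout): returns an object-dtype array,
# so convert back to float afterwards.
ufunc_foo = np.frompyfunc(foo, 1, 1)
out_frompyfunc = ufunc_foo(arr).astype(float)
```

Both results have the same shape as `arr`; the difference is mainly in the interface and the dtype of the raw output.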

The proper numpy way is to change your function so it works with arrays. That shouldn't be hard to do with the function in your comments:

f(z) = 1 / (1 + exp(-z))

There's a np.exp function. The rest is simple math.
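As a rough sketch of that rewrite, assuming a plain ndarray in place of `np.matrix` and using the name `sigmoid` purely for illustration:

```python
import numpy as np

def sigmoid(z):
    """Elementwise f(z) = 1 / (1 + exp(-z)); accepts scalars or arrays."""
    z = np.asarray(z, dtype=float)
    # np.exp and the arithmetic operators all act on the whole array at once.
    return 1.0 / (1.0 + np.exp(-z))

big_matrix = np.ones((1000, 1000))
result = sigmoid(big_matrix)  # no explicit Python loop
```

The whole computation runs in compiled numpy code, so no per-element Python call is involved. For very negative inputs `np.exp(-z)` can overflow and emit a warning; `scipy.special.expit` is an existing alternative if SciPy is available.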

hpaulj