
Let's say we have a particularly simple function like

import scipy as sp

def func(x, y):
    return x + y

This function evidently works for several built-in Python data types for x and y, such as str, list, int, and float, as well as arrays. Since we are particularly interested in arrays, consider two arrays:

x = sp.array([-2, -1, 0, 1, 2])
y = sp.array([-2, -1, 0, 1, 2])

xx = x[:, sp.newaxis]
yy = y[sp.newaxis, :]

>>> func(xx, yy)

this returns

array([[-4, -3, -2, -1,  0],
       [-3, -2, -1,  0,  1],
       [-2, -1,  0,  1,  2],
       [-1,  0,  1,  2,  3],
       [ 0,  1,  2,  3,  4]])

just as we would expect.

Now what if one wants to pass arrays as inputs to the following function?

def func2(x, y):
    if x > y:
        return x + y
    else:
        return x - y

Doing >>> func2(xx, yy) would raise an error, because the truth value of a multi-element boolean array is ambiguous inside the if statement.

The first obvious method that one would come up with is the sp.vectorize function in scipy/numpy. This method, however, has proved to be not very efficient. Can anyone think of a more robust way of broadcasting an arbitrary function onto numpy arrays?
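For concreteness, the failure and the np.vectorize workaround can be sketched as follows (a minimal example using numpy directly rather than the scipy alias; the names are illustrative):

```python
import numpy as np

def func2(x, y):
    if x > y:
        return x + y
    else:
        return x - y

x = np.array([-2, -1, 0, 1, 2])
xx = x[:, np.newaxis]
yy = x[np.newaxis, :]

# Calling func2 on arrays fails: `x > y` is a boolean array, and its
# truth value inside an `if` statement is ambiguous.
try:
    func2(xx, yy)
except ValueError:
    pass

# np.vectorize makes the call work element-wise, but it loops in Python
# under the hood, so it is not much faster than an explicit loop.
vfunc2 = np.vectorize(func2)
result = vfunc2(xx, yy)
```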

If re-writing the code in an array-friendly fashion is the only way, it would help if you could mention it here too.

Nico Schlömer
bhanukiran

3 Answers


np.vectorize is a general way to convert Python functions that operate on numbers into numpy functions that operate on ndarrays.

However, as you point out, it isn't very fast, since it is using a Python loop "under the hood".

To achieve better speed, you have to hand-craft a function that expects numpy arrays as input and takes advantage of that numpy-ness:

import numpy as np

def func2(x, y):
    return np.where(x > y, x + y, x - y)

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

print(func2(xx, yy))
# [[ 0 -1 -2 -3 -4]
#  [-3  0 -1 -2 -3]
#  [-2 -1  0 -1 -2]
#  [-1  0  1  0 -1]
#  [ 0  1  2  3  0]]

Regarding performance:

test.py:

import numpy as np

def func2a(x, y):
    return np.where(x > y, x + y, x - y)

def func2b(x, y):
    ind = x > y
    z = np.empty(ind.shape, dtype=x.dtype)
    z[ind] = (x + y)[ind]
    z[~ind] = (x - y)[~ind]
    return z

def func2c(x, y):
    # x, y = x[:, None], y[None, :]
    A, L = x + y, x <= y
    A[L] = (x - y)[L]
    return A

N = 30
x = np.random.random(N)
y = np.random.random(N)

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

Running:

With N=30:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
1000 loops, best of 3: 219 usec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
1000 loops, best of 3: 488 usec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
1000 loops, best of 3: 248 usec per loop

With N=1000:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
10 loops, best of 3: 93.7 msec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
10 loops, best of 3: 367 msec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
10 loops, best of 3: 186 msec per loop

This seems to suggest that func2a is slightly faster than func2c (and func2b is horribly slow).
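As an aside not in the original answer, np.where also generalizes to more than two branches via np.select; a minimal sketch of the same function written that way:

```python
import numpy as np

def func2_select(x, y):
    # np.select picks, element-wise, the first choice whose condition
    # is True; conditions and choices broadcast against each other.
    return np.select([x > y, x <= y], [x + y, x - y])

x = np.array([-2, -1, 0, 1, 2])
xx = x[:, np.newaxis]
yy = x[np.newaxis, :]
out = func2_select(xx, yy)
```

For just two branches np.where is the simpler and typically faster choice; np.select only pays off when the piecewise definition has several cases.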

unutbu
    np.where can be very useful for stuff like this, too. – matt Mar 04 '11 at 20:07
  • @unutbu: With `where` it looks definitely nice, but have you consider also the implications to performance when implementing with `where`? Thanks – eat Mar 04 '11 at 20:41
  • @unutbu: Interesting timings. Care to time with N like in the range 1e3... 1e4? So far implementation with `where` seems to be most reasonable one. Thanks – eat Mar 04 '11 at 21:27
  • @eat: When I unthinkingly set `N=10000` I made my poor little computer hang :). When `x` has shape (10000,), the return value of `func2*` has shape (10000,10000). With `dtype=float64`, that requires at least 760 MiB. This took me into the realm of buffer swapping. Anyway, I'm leaning towards believing the ordering of the results would not change even as N grows. Do you think it would? – unutbu Mar 04 '11 at 21:53
  • @unutbu: Late already, but I'll do some timings myself tomorrow (but you can still try with 1e3 level). Well, I don't know whether the ordering will change any dramatical way, but with my limited experience, there are cases when `where` brings additional overhead compared to plain `logical indexing`. Thanks – eat Mar 04 '11 at 22:15
  • @unutbu: I made some timings and they are similar to yours. So the implementation with `where` seems to be a clear winner. Thanks – eat Mar 05 '11 at 07:51
  • There is an even faster version for this special case -- see my answer. – Sven Marnach Mar 05 '11 at 13:27

For this special case, you could also write a function that operates on both NumPy arrays and plain Python floats:

def func2d(x, y):
    z = 2.0 * (x > y) - 1.0
    z *= y
    return x + z

This version is also more than four times as fast as unutbu's func2a() (tested with N = 100).
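A quick sanity check (a sketch, not part of the original answer) that this arithmetic trick agrees with the np.where formulation:

```python
import numpy as np

def func2a(x, y):
    return np.where(x > y, x + y, x - y)

def func2d(x, y):
    # z is +1.0 where x > y and -1.0 elsewhere, so x + z * y
    # evaluates to x + y or x - y without any fancy indexing.
    z = 2.0 * (x > y) - 1.0
    z *= y
    return x + z

x = np.random.random(100)[:, np.newaxis]
y = np.random.random(100)[np.newaxis, :]
agree = np.allclose(func2a(x, y), func2d(x, y))
```

The speed advantage plausibly comes from avoiding boolean fancy indexing, which allocates intermediate copies of the masked arrays.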

Sven Marnach
  • +1: Well done! It looks like `func2d` is faster because it requires fewer memory allocations. Do you agree? – unutbu Mar 05 '11 at 14:00
  • @unutbu: Not sure. The first version I wrote used even less temporaries (for example by omitting `- 1.0` in the first line and using `z -= 1.0`), but this was slower. – Sven Marnach Mar 05 '11 at 14:10
  • Hm, that's odd. For me (with N=1000) using `z -= 1.0` takes 40.7ms per loop, while `func2d` takes 47.8ms. – unutbu Mar 05 '11 at 14:17
  • 1
    +1, nice one! In my machine with N= 10, 100, 1000 performance ratios are [1.144, 1.885, 1.624] between func2a/ func2d. Now it would be good to get feedback from OP. Thanks – eat Mar 06 '11 at 09:37

Just to get the basic idea, you may modify your function, for example, like this:

def func2(x, y):
    x, y = x[:, None], y[None, :]
    A = x + y
    A[x <= y] = (x - y)[x <= y]
    return A

Thus, in your case, something like this should be a very reasonable starting point:

In []: def func(x, y):
   ..:     x, y = x[:, None], y[None, :]
   ..:     return x + y
   ..:
In []: def func2(x, y):
   ..:     x, y = x[:, None], y[None, :]
   ..:     A, L = x + y, x <= y
   ..:     A[L] = (x - y)[L]
   ..:     return A
   ..:
In []: x, y = arange(-2, 3), arange(-2, 3)
In []: func(x, y)
Out[]:
array([[-4, -3, -2, -1,  0],
       [-3, -2, -1,  0,  1],
       [-2, -1,  0,  1,  2],
       [-1,  0,  1,  2,  3],
       [ 0,  1,  2,  3,  4]])
In []: func2(x, y)
Out[]:
array([[ 0, -1, -2, -3, -4],
       [-3,  0, -1, -2, -3],
       [-2, -1,  0, -1, -2],
       [-1,  0,  1,  0, -1],
       [ 0,  1,  2,  3,  0]])

Although this kind of processing may seem to waste resources, that is not necessarily the case. Always measure your program's actual performance, and only then (not earlier) make the necessary changes.

IMHO, as an additional advantage, this kind of 'vectorization' makes your code more consistent and, eventually, more readable.

eat