1

I have the following optimization problem. Given two np.arrays X, Y and a function K, I would like to compute, as fast as possible, the Gram matrix gram_matrix whose (i, j)-th element is K(X[i], Y[j]).

Here is an implementation using nested for-loops, which are generally acknowledged to be the slowest way to solve this kind of problem.

import numpy as np

def proxy_kernel(X, Y, K):
    # Evaluate the kernel K on every pair (X[i], Y[j]).
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram_matrix[i, j] = K(x, y)
    return gram_matrix
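
For concreteness, this is how the function gets called; the additive lambda below is only a stand-in for my real K, which is passed in from elsewhere:

X = np.arange(5.0)
Y = np.arange(3.0)
# Toy kernel just for illustration; the real K is a black-box callable.
G = proxy_kernel(X, Y, lambda x, y: x + y)
print(G.shape)  # (5, 3)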

Any help is truly appreciated.

hpaulj
dimstudio
    Might be good to ask this on Code Review instead. – TigerhawkT3 May 06 '15 at 19:56
  • Without any knowledge of what `K` is, you're not going to do much better than nested loops. – user2357112 May 06 '15 at 20:53
  • Unfortunately K is passed as a parameter. It's a pity, because this implementation is really slow, but so far it's the only one that works. – dimstudio May 06 '15 at 21:00
  • This is a basic `numpy` how-to question; it will get more answers here than on Code Review. – hpaulj May 06 '15 at 22:01
  • This has been asked before a number of times. As long as `K` is a black box that only takes scalars there isn't much you can do to improve the speed. But for a start, compare the times for a typical `K` with times for the simplest possible `K`. How much time is spent in iteration, and how much time in evaluating `K` a zillion times? – hpaulj May 06 '15 at 22:05

3 Answers

1

You can surely at least vectorize the inner loop:

def proxy_kernel_vect(X, Y, K):
    # Vectorize K once so that each row can be filled in a single call.
    K_vect = np.vectorize(K)
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        gram_matrix[i] = K_vect(x, Y)
    return gram_matrix

This yields a nice improvement with relatively long arrays:

In [15]: a = np.array(range(1000))
    ...: b = np.array(range(1000))
    ...: 

In [16]: %timeit proxy_kernel(a, b, k)
1 loops, best of 3: 665 ms per loop

In [17]: %timeit proxy_kernel_vect(a, b, k)
1 loops, best of 3: 266 ms per loop

where k is just lambda x, y: x+y.
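
If K actually expects whole rows (vectors) rather than scalars, the plain np.vectorize above will try to apply K elementwise and fail to broadcast. A possible sketch using vectorize's signature argument (available in newer NumPy versions); this assumes K takes two 1-D vectors of the same length, which the question doesn't state:

import numpy as np

def proxy_kernel_rows(X, Y, K):
    # Treat the last axis as the "core" vector dimension of K and
    # broadcast over the remaining (row) dimensions.
    K_vect = np.vectorize(K, signature='(d),(d)->()')
    return K_vect(X[:, None, :], Y[None, :, :])

# Hypothetical usage with row vectors, e.g. X.shape == (600, 2):
# gram = proxy_kernel_rows(X, Y, lambda x, y: np.dot(x, y))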

Bakuriu
  • What's your `k` here? I don't think applying `vectorize` this way actually preserves the original semantics. – user2357112 May 06 '15 at 20:47
  • @user2357112 It's defined in the last line of the answer. It's just a scalar function (since this is what the OP seems to use). – Bakuriu May 06 '15 at 20:49
  • I don't think it's supposed to be a scalar function. In the usual definition of "Gram matrix", the function is an inner product. – user2357112 May 06 '15 at 20:51
  • @Bakuriu Thanks. Unfortunately this doesn't work in my code `gram_matrix[i] = vect(x, Y) ValueError: could not broadcast input array from shape (600,2) into shape (600)` – dimstudio May 06 '15 at 20:58
1

np.vectorize does make some improvement in speed, about 2x (here I'm using math.atan2 as a black-box function that takes 2 scalar arguments).

In [486]: X=np.linspace(0,1,100)
In [487]: K_vect=np.vectorize(math.atan2)

In [488]: timeit proxy_kernel(X,X,math.atan2)
100 loops, best of 3: 7.84 ms per loop

In [489]: timeit K_vect(X[:,None],X)
100 loops, best of 3: 3.49 ms per loop

In [502]: timeit np.arctan2(X[:,None],X)  # numpy compiled loop
1000 loops, best of 3: 871 µs per loop

where

def proxy_kernel(X,Y,K):
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram_matrix[i, j] = K(x, y)
    return gram_matrix

As long as K is a black box, you are limited by the time it takes to invoke K X.shape[0]*Y.shape[0] times. You can try to minimize the iteration overhead, but you are still limited by all those function calls.


https://stackoverflow.com/a/29733040/901925 speeds up the calculation with a Gaussian kernel by taking advantage of the axis parameter of the np.linalg.norm function.
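
For reference, a minimal sketch of that idea for a Gaussian (RBF) kernel; the function name gaussian_gram and the gamma parameter are just illustrative choices, and X, Y are assumed to be 2-D arrays of row vectors:

import numpy as np

def gaussian_gram(X, Y, gamma=1.0):
    # Pairwise differences via broadcasting: diffs has shape (n, m, d).
    diffs = X[:, None, :] - Y[None, :, :]
    # Squared Euclidean distances collapse the last axis with np.linalg.norm.
    sq_dists = np.linalg.norm(diffs, axis=2) ** 2
    return np.exp(-gamma * sq_dists)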

hpaulj
  • Given the name `gram_matrix`, I'm pretty sure `K` is supposed to be a function of two vectors, not two scalars. – user2357112 May 07 '15 at 01:30
  • On CR he references machine-learning kernel functions, but he's still unclear about the dimensions: http://codereview.stackexchange.com/questions/90005/efficient-element-wise-function-computation-in-python – hpaulj May 10 '15 at 06:09
  • Why does using vectorize give any speedup? I thought vectorize just makes the code prettier but is still just a for loop internally? – rasen58 Oct 11 '18 at 22:17
  • I don't know in this case. `vectorize` uses `frompyfunc`. `frompyfunc` often tests 2x faster than an explicit loop, but `vectorize` has more overhead and is often a bit slower. I don't dwell too much on 2x differences either way; what we want is the 10x difference that `np.arctan2` provides (see the sketch below). – hpaulj Oct 12 '18 at 00:30
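
For completeness, a small sketch of the frompyfunc route mentioned in the comment above (my illustration, not hpaulj's code); note that np.frompyfunc returns an object-dtype array, so a final cast is needed:

import math
import numpy as np

X = np.linspace(0, 1, 100)
# Build a ufunc from the Python-level scalar function (2 inputs, 1 output).
K_fp = np.frompyfunc(math.atan2, 2, 1)
gram = K_fp(X[:, None], X).astype(float)  # cast back from object dtype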
0

You can also try the vectorize decorator from the numba module.

Your particular problem is easily solved using vectorize and numpy broadcasting:

import numpy as np
from numba import vectorize

# numba compiles K into a true NumPy ufunc, so it broadcasts like np.arctan2.
@vectorize(['float64(float64, float64)'])
def K(x, y):
    return x + y

a = np.arange(1000, dtype=np.float64)
b = np.arange(1000, dtype=np.float64)

# Broadcasting so that gram_array[i, j] == K(a[i], b[j]), as in the question.
gram_array = K(a[:, None], b[None, :])
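
As a quick sanity check (my own sketch; it assumes the proxy_kernel loop version from the question is also defined in the same session), the broadcasted result should match the nested loops:

# Compare the numba/broadcasting result with the original nested-loop version.
expected = proxy_kernel(a, b, lambda x, y: x + y)
assert np.allclose(gram_array, expected)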
btel