Batch call to array of functions in Python

Question

I am using scipy.interpolate.Rbf, which returns a function, to fit a large number RBF to different sets of points, and storing the output of this in a vector of functions, as follows

import scipy.interpolate as interp

for i in range(0,n):
    # Gets data points for this particular iteration
    data = get_data(i)

    # Fits RBF to data points
    zfun_smooth_rbf = interp.Rbf(data[:, 0], data[:, 1], data[:, 2], function='linear', smooth=0) 

    # Appends RBF function
    rbf_fit.append(zfun_smooth_rbf)

And then I am interested in running all the functions to regress a value for each computed RBF. Currently I use a foor loop strategy, similar to what was answered in this question, but this is far from ideal, because it basically runs this sequentially

c = [float(f(x,y) for f in self.rbf_fit]

Is there anyway to run this call with a single call? In other words, I need to call all the functions stored in an array, at the same time. Something like c = self.rbf_fit[:](x,y)?

Other than possibly speed, what's the advantage to `at the same time` versus `sequentially`? Are they interacting? — hpaulj, Jul 04 '17 at 16:54
CLick on the `[source]` link on the documentation to see the definition of the `Rbf` class. It's clear that the `function` parameter has to be a `str` or `callable`, not list or other collection of functions. Most of the code just sets up the object; the actual calculation is in the `__call__`. Focus on that if you want to streamline its use with multiple functions. — hpaulj, Jul 04 '17 at 17:06
@hpaulj it's for speed reasons, I need to call a few thousand times this functions — Dan, Jul 04 '17 at 18:33
Looking again at your problem, I realize that it's the `args` that differ in each loop, not the `function` parameter. Still I'd suggest studying that `Rbf` code to understand how those `args` (your `get_data(i)`) are used. Looks like they determine both the `rbf.xi` and `rbf.nodes` attributes. — hpaulj, Jul 04 '17 at 18:46
@hpaulj it's a fitting function, the result of course depends on the target points, and each iteration is independent from the other (ie different "random" data points to fit). — Dan, Jul 04 '17 at 19:19
Because the function is already fit when I do the batch call! interp.Rbf does the fitting and return a Radial Based Function for that particular set of data, and this function is stored (appended) rbf_fit. I do this for each set of data points. Then I want to call ALL the fitted functions at the same time. — Dan, Jul 04 '17 at 19:36

hpaulj · Answer 1 · 2017-07-05T04:18:13.083

I'm going to try to combine the __call__ of 2 rbfi into one call.

In [80]: from scipy.interpolate import Rbf

Make a sample as illustrated in the docs:

In [81]: x, y, z, d = np.random.rand(4, 50)
In [82]: rbfi0 = Rbf(x, y, z, d)
In [83]:  xi = yi = zi = np.linspace(0, 1, 20)
In [84]: di0 = rbfi0(xi, yi, zi)
In [85]: di0
Out[85]: 
array([ 0.26614249,  0.07429816, -0.01512205,  0.05134466,  0.24213774,
        0.41653342,  0.45280185,  0.34763177,  0.17681661,  0.07186139,
        0.16299749,  0.40416788,  0.641642  ,  0.78828711,  0.79709639,
        0.6530432 ,  0.42473033,  0.24155719,  0.17008326,  0.179932  ])

Make a second sample:

In [86]: x, y, z, d = np.random.rand(4, 50)
In [87]: rbfi1 = Rbf(x, y, z, d)
In [88]: di1 = rbfi1(xi, yi, zi)
In [89]: di1
Out[89]: 
array([ 0.38975158,  0.39887118,  0.42430634,  0.48554998,  0.59403568,
        0.71745345,  0.77483525,  0.70657269,  0.53545478,  0.34931526,
        0.28960157,  0.45825098,  0.7538652 ,  0.99950089,  1.14749381,
        1.19019632,  1.12736371,  1.00558691,  0.87811695,  0.77231634])

Look at the key attributes of the rbfi:

In [90]: rbfi0.nodes
Out[90]: 
array([ -13.02451018,   -3.05675802,    8.54073071,  -81.47163716,
         -5.74247623,  118.70153224,   -1.39117053,   -3.37170396,
         ....
        -10.08326243,    8.9995743 ,    3.83357612,   -4.59815344,
        -25.09981508,   -2.8753806 ,   -0.63932038,   76.59402274,
          0.26222997,  -30.35280108])
In [91]: rbfi0.nodes.shape
Out[91]: (50,)
In [92]: rbfi1.nodes.shape
Out[92]: (50,)
In [93]: rbfi0.xi.shape
Out[93]: (3, 50)
In [94]: rbfi1.xi.shape
Out[94]: (3, 50)

Construct the variables in the __call__:

In [95]: xa = np.asarray([a.flatten() for a in [xi,yi,zi]], dtype=np.float_)
In [96]: xa.shape
Out[96]: (3, 20)
In [97]: r0 = rbfi0._call_norm(xa, rbfi0.xi)
In [98]: r1 = rbfi1._call_norm(xa, rbfi1.xi)
In [99]: r0.shape
Out[99]: (20, 50)
In [100]: r1.shape
Out[100]: (20, 50)

Compute the norm for both rbfi with one call - by concatenating the xi arrays:

In [102]: r01 = rbfi0._call_norm(xa, np.concatenate((rbfi0.xi, rbfi1.xi),axis=1))
In [103]: r01.shape
Out[103]: (20, 100)
In [104]: np.allclose(r0, r01[:,:50])
Out[104]: True
In [105]: np.allclose(r1, r01[:,50:])

Now do the same for the nodes' anddot`:

In [110]: res01 = np.dot(rbfi0._function(r01), np.concatenate((rbfi0.nodes, rbfi1.nodes)))
In [111]: res01.shape
Out[111]: (20,)

Oops. We want two sets of 20; this fits it against all 100 nodes at once. I need to do some reshaping.

In [133]: r01.shape
Out[133]: (20, 100)
In [134]: r01 = r01.reshape(20,2,50)
In [135]: nodes01 = np.concatenate((rbfi0.nodes, rbfi1.nodes))
In [136]: nodes01.shape
Out[136]: (100,)
In [137]: nodes01 = nodes01.reshape(2,50) # should have just stacked them

The _function callables for the 2 rbfi differ, so I have use them separately:

In [138]: fr01 = [rbfi0._function(r01[:,0,:]), rbfi1._function(r01[:,1,:])]
In [139]: fr01[0].shape
Out[139]: (20, 50)

With more samples this list would be constructed with a list comprehension.

In [140]: fr01 = np.stack(fr01, axis=1)
In [141]: fr01.shape
Out[141]: (20, 2, 50)

Now I can do the np.dot for the combined rbfi:

In [142]: res01 = np.einsum('ijk,jk->ij', fr01, nodes01)
In [143]: res01.shape
Out[143]: (20, 2)
In [144]: np.allclose(res0, res01[:,0])
Out[144]: True
In [145]: np.allclose(res1, res01[:,1])
Out[145]: True

In [149]: di01 = np.stack([rbfi0(xi, yi, zi), rbfi1(xi, yi, zi)],axis=1)
In [150]: di01.shape
Out[150]: (20, 2)
In [151]: np.allclose(di01, res01)

So I've managed to replace the In [149] iteration with a In [138] one. I don't know if that's a time savings or not. It may depend on how costly the _function call is compared to the rest the rbfi.__call__.

In my example

In [131]: rbfi0._function
Out[131]: <bound method Rbf._h_multiquadric of <scipy.interpolate.rbf.Rbf object at 0xab002fac>>

I don't know if your parameters, function='linear', smooth=0 make a difference. If the respective _function attributes are the same, then I could replace iteration with a

rbfi0._function(r01).reshape(20,2,50)

That gives an idea of how you might speed up the iteration of the rbfi, and maybe even replace it with a 'vector' operation.

It looks like, for the default function, the difference is only in the epsilon value:

In [156]: rbfi0._function??
Signature: rbfi0._function(r)
Source:   
    def _h_multiquadric(self, r):
        return np.sqrt((1.0/self.epsilon*r)**2 + 1)
File:      /usr/local/lib/python3.5/dist-packages/scipy/interpolate/rbf.py
Type:      method
In [157]: rbfi0.epsilon
Out[157]: 0.25663331561494024
In [158]: rbfi1.epsilon
Out[158]: 0.26163317529091562

Batch call to array of functions in Python

1 Answers1