14

I have done some work in Python, but I'm new to scipy. I'm trying to use the methods from the interpolate library to come up with a function that will approximate a set of data.

I've looked up some examples to get started, and could get the sample code below working in Python(x,y):

import numpy as np
from scipy.interpolate import interp1d, Rbf
import pylab as P

# show the plot (empty for now)
P.clf()
P.show()

# generate evenly spaced input data (10 points in [0, 1])
original_data = np.linspace(0, 1, 10)

# random noise to be added to the data
noise = (np.random.random(10)*2 - 1) * 1e-1

# calculate f(x)=sin(2*PI*x)+noise
f_original_data = np.sin(2 * np.pi * original_data) + noise

# create interpolator
rbf_interp = Rbf(original_data, f_original_data, function='gaussian')

# create new sample data (for input), calculate f(x)
# using different interpolation methods
new_sample_data = np.linspace(0, 1, 50)
rbf_new_sample_data = rbf_interp(new_sample_data)

# draw all results to compare
P.plot(original_data, f_original_data, 'o', ms=6, label='f_original_data')
P.plot(new_sample_data, rbf_new_sample_data, label='Rbf interp')
P.legend()
# display the finished figure (needed outside interactive sessions)
P.show()

The plot is displayed as follows:

interpolation-plot

Now, is there any way to get a polynomial expression representing the interpolated function created by Rbf (i.e. the method created as rbf_interp)?

Or, if this is not possible with Rbf, any suggestions using a different interpolation method, another library, or even a different tool are also welcome.

E.Z.
  • Looking at the docs for `scipy.interpolate.Rbf` and for the `scipy.interpolate` module, I can't see anything related to what you want. The only way I can think of is to read the source code, understand how the function is built from the parameters, and then write your own function that returns a readable representation. This would depend on implementation details, though I can't see how you could avoid that. – Bakuriu Sep 26 '12 at 12:25
  • Consider two steps: data -> a curve (by Rbf or whatever), then curve -> piecewise polynomial: a spline. It's easy to save spline parameters, (npiece + 1) * 4 -- ask further if that's what you want to do. – denis Sep 10 '13 at 15:26
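The two-step idea in the comment above (data -> curve, then curve -> piecewise polynomial) could be sketched roughly as follows. This is only an illustration: the noise-free sine data, the choice of 6 knots and the use of `scipy.interpolate.CubicSpline` are assumptions, not something the commenter specified.

```python
import numpy as np
from scipy.interpolate import CubicSpline, Rbf

# Step 1: data -> curve (illustrative noise-free data, Rbf as in the question)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x)
rbf = Rbf(x, y, function='gaussian')

# Step 2: curve -> cubic spline through a few samples of the curve
# (6 knots is an arbitrary choice made for this sketch)
knots = np.linspace(0, 1, 6)
spline = CubicSpline(knots, rbf(knots))

# spline.c has shape (4, n_pieces). On [knots[i], knots[i+1]] the spline is
#   S(x) = sum_k c[k, i] * (x - knots[i])**(3 - k)
for i in range(len(knots) - 1):
    a, b, c, d = spline.c[:, i]
    x0 = knots[i]
    print(f"[{x0:.1f}, {knots[i + 1]:.1f}]: "
          f"{a:+.3f}*(x-{x0:.1f})^3 {b:+.3f}*(x-{x0:.1f})^2 "
          f"{c:+.3f}*(x-{x0:.1f}) {d:+.3f}")
```

Each piece is an ordinary cubic, so the whole curve can be saved as the knot positions plus four coefficients per piece.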

5 Answers

5

The answer is no, there is no "nice" way to write down the formula, or at least not a short one. Some types of interpolation, like RBF and Loess, do not search for a parametric mathematical function to fit to the data; instead, they calculate the value at each new point separately as a function of the other points.

These interpolations can follow almost any data closely (as in your case), and the reason is that describing them takes a very large number of parameters (basically all your data points). Think of it this way: you could interpolate linearly by connecting consecutive data points with straight lines. You could fit any data this way and then describe the function in a mathematical form, but it would take a large number of parameters (at least as many as the number of points). What you are doing right now is pretty much a smoothed version of that.
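To make the parameter count concrete, here is a small sketch of the piecewise-linear case described above. The noise-free sine data is an assumption chosen so the output is reproducible.

```python
import numpy as np

# illustrative data: 10 evenly spaced points on a sine curve
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x)

# one slope and one intercept per segment between consecutive points
slopes = np.diff(y) / np.diff(x)
intercepts = y[:-1] - slopes * x[:-1]

# the "formula" is a list of 9 separate line equations
for left, right, m, b in zip(x[:-1], x[1:], slopes, intercepts):
    print(f"{left:.3f} <= x <= {right:.3f}: f(x) = {m:+.3f}*x {b:+.3f}")

print("parameters needed:", slopes.size + intercepts.size)
```

For 10 points this already takes 18 numbers, which is why such a description is exact on the data but not a short formula.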

If you want the formula to be short, you need to describe the data with a mathematical function that has few parameters (specifically, far fewer parameters than data points). Examples are logistic functions, polynomials and even the sine function (which you used to generate the data). Obviously, if you know which function generated the data, that is the function you want to fit.
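As a sketch of what fitting a low-parameter model might look like, the following uses `scipy.optimize.curve_fit` with a three-parameter sine. The model form, the seeded noise and the starting guess `p0` are all assumptions made for this illustration; with real data you would pick the model family from what you know about the data.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, amplitude, frequency, phase):
    """Hypothetical three-parameter model chosen for this sketch."""
    return amplitude * np.sin(2 * np.pi * frequency * x + phase)

# data in the style of the question, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + (rng.random(10) * 2 - 1) * 1e-1

# least-squares fit; p0 is the initial guess for the three parameters
params, _ = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])
amplitude, frequency, phase = params

print(f"f(x) ~ {amplitude:.3f} * sin(2*pi*{frequency:.3f}*x + {phase:+.3f})")
```

Here the whole fit is described by three numbers, regardless of how many data points there are.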

Bitwise
  • Hello @Bitwise, of course in the code above I know the function (`f(x)=sin(2*PI*x)+noise`); in reality I have a set of data in a CSV and no idea what the function is (I can only make guesses based on its shape, that's all) – E.Z. Sep 26 '12 at 14:11
  • @Eduardo yes, that is what I assumed. The point is that if you have some idea of what kind of model might fit the data, either by prior knowledge or just looking at the data, that could point you towards what function type you should use. Otherwise it could be anything and the only thing you can do is try out different families of parametric functions. – Bitwise Sep 26 '12 at 14:49
  • I slightly edited my answer and added a short example of linear interpolation. – Bitwise Sep 26 '12 at 15:04
5

The RBF uses whatever function you ask for. It is a global model, so yes, there is a resulting function, but you will probably not like it, since it is a sum over many Gaussians. You have:

 rbf.nodes   # the factors for each of the RBF (probably gaussians)
 rbf.xi      # the centers.
 rbf.epsilon # the width of the gaussian, but remember that the Norm plays a role too

So with these you can calculate the distances (using rbf.xi), then plug the distances, together with the factors in rbf.nodes and rbf.epsilon, into the Gaussian (or whatever function you asked it to use). (You can check the Python code of __call__ and _call_norm.)

So for 1-D data you get something like sum(rbf.nodes[i] * gaussian(rbf.epsilon, sqrt((x - center)**2)) for i, center in enumerate(rbf.xi[0])), to give some funny half code/formula. The RBF's function is written in the documentation, but you can also check the Python code.
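The half code/formula above can be turned into a runnable sketch. It assumes 1-D input, the default Euclidean norm, `smooth=0`, and that the Gaussian basis of the legacy `Rbf` class is `exp(-(r/epsilon)**2)`, which matches its documentation as far as I can tell. The helper name `rbf_by_hand` and the seeded data are made up for this illustration.

```python
import numpy as np
from scipy.interpolate import Rbf

# data in the style of the question, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + (rng.random(10) * 2 - 1) * 1e-1

rbf = Rbf(x, y, function='gaussian')

def rbf_by_hand(x_new):
    """Evaluate sum_i nodes[i] * exp(-(|x - xi[i]| / epsilon)**2) (hypothetical helper)."""
    centers = rbf.xi[0]  # for 1-D data, rbf.xi has shape (1, N)
    x_new = np.asarray(x_new, dtype=float)
    r = np.abs(x_new[:, None] - centers[None, :])  # distances to every center
    return np.exp(-(r / rbf.epsilon) ** 2) @ rbf.nodes

x_new = np.linspace(0, 1, 50)
print("max difference:", np.max(np.abs(rbf_by_hand(x_new) - rbf(x_new))))

# the explicit (long) formula: one Gaussian term per data point
formula = " ".join(
    f"{w:+.4f}*exp(-((x - {c:.4f})/{rbf.epsilon:.4f})**2)"
    for w, c in zip(rbf.nodes, rbf.xi[0])
)
print("f(x) =", formula)
```

The printed formula has one term per data point, which is the "sum over many Gaussians" mentioned above.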

seberg
1

RBF likely stands for Radial Basis Function. I wouldn't be surprised if scipy.interpolate.Rbf was the function you're looking for.

However, I doubt you'll be able to find a polynomial expression to represent your result.

If you want to try different interpolation methods, check the corresponding SciPy documentation, which links to RBF, splines...

Pierre GM
  • Hi @Pierre, maybe I was not clear in my question. I am using `scipy.interpolate.Rbf` already. What I would like to know is if I can somehow extract a formula which represents the created interpolation function (for example `exp(x)+x^3-x^2`, etc.) – E.Z. Sep 26 '12 at 11:54
1

I don’t think SciPy’s RBF will give you the actual function. But one thing you could do is sample the function that SciPy’s RBF gave you (e.g. at 100 points), then use Lagrange interpolation on those points. This will generate a polynomial function for you. Here is an example of how this would look. If you do not want to use Lagrange interpolation, you can also use “Newton’s divided differences method” to generate a polynomial function.

[image]
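A minimal sketch of the sampling idea, using `scipy.interpolate.lagrange`. It deliberately uses 8 sample points rather than 100, an assumption made here because the SciPy documentation warns that this routine is numerically unstable beyond roughly 20 points. The noise-free data is also an illustrative choice.

```python
import numpy as np
from scipy.interpolate import Rbf, lagrange

# illustrative data and an Rbf interpolator as in the question
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x)
rbf = Rbf(x, y, function='gaussian')

# sample the Rbf curve at a small number of points (8 is an arbitrary choice)
xs = np.linspace(0, 1, 8)
poly = lagrange(xs, rbf(xs))  # returns a numpy.poly1d of degree 7

print(poly)          # the polynomial written out as text
print(poly.coeffs)   # coefficients, highest power first
```

The polynomial matches the Rbf curve exactly only at the sampled points; between them, and especially near the ends of the interval, it may deviate.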

Joel
0

My answer is based on NumPy only (plus matplotlib for the plot):

import matplotlib.pyplot as plt
import numpy as np

x_data = [324, 531, 806, 1152, 1576, 2081, 2672, 3285, 3979, 4736]
y_data = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
x = np.array(x_data)
y = np.array(y_data)

# fit a degree-2 polynomial; poly1d wraps the coefficients in a callable
model = np.poly1d(np.polyfit(x, y, 2))
ynew = model(x)

plt.plot(x, y, 'o', x, ynew, '-')
# str(model) is the fitted polynomial as text; show it as the axis label
plt.ylabel(str(model).strip())
plt.show()

Mhadhbi issam