Better way to create fit functions with changing number of parameters in Python with Scipy curve_fit

Question

I am looking for a better way to use scipy's curve_fit()

I am currently using it to fit linear combinations of the parameters and a vector x.

For example, here is a fitting function I pass for trying the fit with 5 parameters, m0-m4:

def degFour(x, m0, m1, m2, m3, m4):
    return x[0]*m0 + x[1]*m1 + x[2]*m2 + x[3]*m3 + x[4]*m4

I have made more of these up to degTen using the same pattern. It does work, too.

My x vector:

[[ 1.          1.          1.          1.          1.        ]
 [ 1.          0.99990931  0.99963727  0.99918392  0.99854935]
 [ 1.          0.94872591  0.80016169  0.56954235  0.28051747]
 [ 1.          0.84717487  0.43541052 -0.10943716 -0.62083535]
 [ 1.          0.77991807  0.21654439 -0.44214431 -0.90621706]
 [ 1.          0.73162055  0.07053725 -0.62840754 -0.99004899]
 [ 1.          0.68866877 -0.05147065 -0.75956123 -0.99470154]
 [ 1.          0.64892616 -0.15778967 -0.85371386 -0.95020484]
 [ 1.          0.6114128  -0.25234877 -0.91999134 -0.8726402 ]
 [ 1.          0.57600247 -0.33644232 -0.96358568 -0.77361313]
 [ 1.          0.54225052 -0.41192874 -0.98898767 -0.66062942]
 [ 1.          0.29541145 -0.82546415 -0.78311458  0.36278212]
 [ 1.          0.09546594 -0.98177251 -0.28291761  0.92775452]
 [ 1.         -0.07539697 -0.9886306   0.22447646  0.95478091]
 [ 1.         -0.22050008 -0.90275943  0.61861713  0.62994918]
 [ 1.         -0.33964821 -0.76927818  0.86221613  0.18357784]
 [ 1.         -0.54483185 -0.40631651  0.9875802  -0.66981378]
 [ 1.         -0.71937092  0.03498904  0.66903073 -0.99755153]
 [ 1.         -1.          1.         -1.          1.        ]]

My y data:

[  3.50032   3.5007    3.6328    3.94564   4.12814   4.2651    4.39586
   4.51982   4.64394   4.76738   4.88654   5.90314   6.93304   7.99074
   9.04278  10.02426  12.01392  14.0592   18.1689 ]

Using curve_fit(degFour, xdata.T, ydata), I get the correct coefficients:

[ 9.14562709 -7.05004692  1.66932215 -0.27868686  0.02097462]

I recreate the x data depending on the degree, so I will always pass in data with the correct shape.

I tried a version of fbstj's answer regarding variable input parameters.

I used this:

def vararg(x, *args):
    return sum(a * x[i] for i, a in enumerate(args))

and ended up with this:

Traceback (most recent call last):
  File "D:/Libraries/Desktop/PScratch2/vararg.py", line 18, in <module>
    print(curve_fit(vararg, deg4kary.T, deg4ydata))
  File "C:\Python35\lib\site-packages\scipy\optimize\minpack.py", line 606, in curve_fit
    raise ValueError("Unable to determine number of fit parameters.")
ValueError: Unable to determine number of fit parameters.

As you can see from the trace, I just passed the function itself. I am stuck.

This data I've shown here is part of a larger set. The set is broken into subsets and fitted. The result fit and its first derivative must be piecewise continuous. There are other things at play, but as far as I know, I have to use this method. — Chris, May 26 '16 at 14:47
The definition of a cubic spline is a cubic piecewise continuous interpolation function with a piecewise continuous first derivative. I'd try using `scipy.interpolate.InterpolatedUnivariateSpline`. [Documentation](http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.interpolate.InterpolatedUnivariateSpline.html) — Eli Sadoff, May 26 '16 at 14:50
the curve_fit need to explicitly say the number of fit parameters — galaxyan, May 26 '16 at 15:09

score 2 · Accepted Answer · answered May 26 '16 at 20:54

You are fitting a multivariate linear model to your data. This can be expressed as a dot product between your x vector, with shape (npoints, nparams), and a single (nparams,) vector of coefficients, say m:

def linear(x, m):
    return x.dot(m)

x = np.random.randn(100, 5)
m = np.random.randn(5)

y = linear(x, m)

There's really no need to use curve_fit to get the m coefficients - it's simpler and more efficient to use np.linalg.lstsq to solve linear systems such as this:

m_hat, residuals, rank, singular_vals = np.linalg.lstsq(x, y)

Here, m_hat will be an (n_params,) vector containing the least-squares estimates of m0, m1, m2 etc. This will work for any number of coefficients.

Your answer is certainly more elegant than what I had previously implemented but I'm having trouble using it. For example, I start with two sets of values, both (m,) , then create a linear map from the independent variable to the range [-1, 1] called k. I then create an array (m, n) where n+1 is the number of coefficients I wish to use, coeff_n = [0, n]. The array values are cos(coeff_n * arccos(k)). These columns are the basis vectors, the first one being all ones. The second being k, etc... Given this setup, how would I use your method? Thanks. — Chris, Jun 06 '16 at 12:36

Better way to create fit functions with changing number of parameters in Python with Scipy curve_fit

1 Answers1