44

In Python I have a function with many parameters. I want to fit this function to a data set while fitting only one parameter; the remaining parameters I want to supply myself. Here is an example:

def func(x,a,b):
   return a*x*x + b

for b in xrange(10):
   popt,pcov = curve_fit(func,x1,x2)

Here I want the fit to be done only for a, with the parameter b taking the value of the loop variable. How can this be done?

Cœur
lovespeed
  • You should look at http://en.wikipedia.org/wiki/Curve_fitting – ninjagecko Aug 31 '12 at 03:59
  • 1
    There're infinite ways to define what it means to "fit" a curve, and for each method, many ways to implement it. The type of curve-fitting you want is often dependent on the problem you're trying to solve. Assuming you don't care, one simple way is called least squares, which minimizes the sum of the squares of the errors. Here is a pre-made library that calculates the solution to a "damped" least squares: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html Question is incomplete though; I suggest to close and reopen with a specific question about curve-fitting. – ninjagecko Aug 31 '12 at 04:01
  • 5
    I don't care about the algorithm; I will just use curve_fit from scipy.optimize. What I can't understand is where I should specify that one of the parameters takes my value, and which parameter should be fitted. – lovespeed Aug 31 '12 at 04:18
  • 8
    @ninjagecko His question is very specific and has a very clear purpose. He is not asking how the process of curve fitting works. – PaulMag Nov 21 '14 at 15:36
  • I have [another suggestion](https://stackoverflow.com/questions/58463550/using-scipy-curve-fit-with-variable-number-of-parameters-to-optimize/58463551#58463551) which might be more intuitive – Alejandro Oct 19 '19 at 12:15

7 Answers

64

You can wrap func in a lambda, as follows:

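from scipy.optimize import curve_fit

def func(x, a, b):
    return a*x*x + b

# x1 holds the x-data and x2 the y-data, as in the question
for b in range(10):
    # the lambda exposes only a to curve_fit; b comes from the loop
    popt, pcov = curve_fit(lambda x, a: func(x, a, b), x1, x2)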

A lambda is an anonymous function, which in Python is limited to a single expression. It is normally used to reduce the amount of code when you don't need to assign a name to the function. A more detailed description is given in the official documentation: http://docs.python.org/tutorial/controlflow.html#lambda-forms

In this case, a lambda is used to fix one of the arguments of func. The newly created function accepts only two arguments: x and a, whereas b is fixed to the value taken from the local b variable. This new function is then passed into curve_fit as an argument.
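As a minimal, self-contained sketch of the lambda approach, the snippet below runs the loop on synthetic data. The data (true values a = 2, b = 3, the noise level, and the seed) and the `results` dictionary are assumptions made up for illustration; they are not part of the original question.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b):
    return a * x * x + b


# Synthetic data (assumed for illustration): a = 2, b = 3, small Gaussian noise
rng = np.random.default_rng(0)
x1 = np.linspace(0.0, 5.0, 50)
x2 = func(x1, 2.0, 3.0) + rng.normal(scale=0.1, size=x1.size)

results = {}
for b in range(10):
    # curve_fit sees a function of (x, a) only, so it fits a single parameter
    popt, pcov = curve_fit(lambda x, a: func(x, a, b), x1, x2)
    results[b] = popt[0]

for b, a_fit in results.items():
    print("b=%d, fitted a=%.4f" % (b, a_fit))
```

Because the lambda is called inside the same loop iteration that creates it, it always sees the current value of b.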

Anton Beloglazov
6

A better approach would be to use lmfit, which provides a higher-level interface to curve fitting. Among other features, lmfit makes fitting parameters first-class objects that can have bounds or be explicitly fixed.

Using lmfit, this problem might be solved as:

from lmfit import Model
def func(x,a,b):
   return a*x*x + b

# create model
fmodel = Model(func)
# create parameters -- these are named from the function arguments --
# giving initial values
params = fmodel.make_params(a=1, b=0)

# fix b:
params['b'].vary = False

# fit parameters to data with various *static* values of b:
for b in range(10):
   params['b'].value = b
   result = fmodel.fit(x2, params, x=x1)  # x2 is the y-data, x1 the x-data, as in the question
   print(": b=%f, a=%f+/-%f, chi-square=%f" % (b, result.params['a'].value, 
                                             result.params['a'].stderr,
                                             result.chisqr))
M Newville
2

Instead of using a lambda function, which might be less intuitive to digest, I would recommend specifying the bounds parameter of SciPy's curve_fit, which forces each parameter to be searched within custom boundaries.

All you have to do is let your variable a move between -inf and +inf and your variable b between (b - epsilon) and (b + epsilon).

In your example:

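import numpy as np
from scipy.optimize import curve_fit

epsilon = 0.00001

def func(x,a,b):
    return a*x*x + b

for b in range(10):
    # bounds = (lower bounds for (a, b), upper bounds for (a, b))
    popt,pcov = curve_fit(func,x1,x2, bounds=((-np.inf,b-epsilon), (np.inf,b+epsilon)))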
bobo32
  • 1
    this gives the illusion of 2 variables when in fact the goal is to have 1. I recommend strongly against this approach. – M Newville May 23 '18 at 23:47
  • There is no illusion. The usage of Epsilon is to recall the definition of the mathematical limit and it is less invasive than importing a new library. – bobo32 May 24 '18 at 14:08
  • 3
    When counting the number of variables for statistical purposes (like reduced chi-square), is the number of variables 1 or 2? With your use of the default `curve_fit(..., absolute_sigma=False)`, the returned covariance matrix (your `pcov`) will be 2x2 and rescaled assuming there are 2 variables, even though there is 1 actual variable. So the uncertainties for `a` will be incorrectly estimated. If `b` is not a variable, do not make it a variable with no freedom to change its value. – M Newville May 24 '18 at 17:16
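The shape difference raised in the comments can be checked directly. The sketch below fits the same synthetic data once with the tight-bounds approach and once with b genuinely fixed through a lambda, then compares the shapes of the returned covariance matrices. The data, the seed, and the variable names `pcov_bounds` and `pcov_fixed` are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b):
    return a * x * x + b


# Synthetic data (assumed for illustration): a = 2, b = 3, small Gaussian noise
rng = np.random.default_rng(0)
x1 = np.linspace(0.0, 5.0, 50)
x2 = func(x1, 2.0, 3.0) + rng.normal(scale=0.1, size=x1.size)

b = 3
epsilon = 1e-5

# Tight bounds: b is still counted as a fit variable
popt_bounds, pcov_bounds = curve_fit(
    func, x1, x2,
    bounds=((-np.inf, b - epsilon), (np.inf, b + epsilon)),
)

# Lambda wrapper: only a is a fit variable
popt_fixed, pcov_fixed = curve_fit(lambda x, a: func(x, a, b), x1, x2)

print("bounds approach:", popt_bounds.shape, pcov_bounds.shape)
print("fixed-b approach:", popt_fixed.shape, pcov_fixed.shape)
```

With bounds, curve_fit still treats the problem as having two parameters, so it returns two values and a 2x2 covariance matrix; the wrapper returns one value and a 1x1 matrix.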
1

I effectively use Anton Beloglazov's solution, though I like to avoid using lambda functions for readability so I do the following:

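from scipy.optimize import curve_fit

def func(x,a,b):
    return a*x*x + b

def helper(x,a):
    # b is read from the enclosing (module-level) scope at call time
    return func(x,a,b)

for b in range(10):
    popt,pcov = curve_fit(helper, x1, x2)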

This ends up being reminiscent of Rick Berg's answer, but I like having one function dedicated to the "physics" of the problem and a helper function to get the code to work.
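If relying on a module-level b feels fragile, one possible variation is a small factory that returns a helper with b captured as a local variable. This is only a sketch: `make_helper`, the `fits` dictionary, and the synthetic data are hypothetical names and values introduced here for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b):
    """The "physics" of the problem."""
    return a * x * x + b


def make_helper(b):
    """Return a one-parameter model with b frozen at the given value."""
    def helper(x, a):
        return func(x, a, b)
    return helper


# Synthetic data (assumed for illustration): a = 2, b = 3, small Gaussian noise
rng = np.random.default_rng(0)
x1 = np.linspace(0.0, 5.0, 50)
x2 = func(x1, 2.0, 3.0) + rng.normal(scale=0.1, size=x1.size)

fits = {}
for b in range(10):
    popt, pcov = curve_fit(make_helper(b), x1, x2)
    fits[b] = popt[0]
```

Each call to `make_helper` binds its own b, so the helper does not depend on where the loop variable lives.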

Francisco C
1

Another way is to set the lower and upper bounds of the fixed parameter to (almost) the same value, namely its initial value minus and plus a small eps. Using the same example with initial conditions and bounds:

import numpy as np
from scipy.optimize import curve_fit

def func(x,a,b):
    return a*x*x + b

# free for a and b
popt,pcov = curve_fit(func, x1, x2,
                      p0=[1,1],
                      bounds=[(-np.inf,-np.inf),(np.inf,np.inf)])

# free for a; b fixed to 1 within +/- eps
eps = 0.01
popt,pcov = curve_fit(func, x1, x2,
                      p0=[1,1],
                      bounds=[(-np.inf,(1-eps)),(np.inf,(1+eps))])

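Remember to insert an epsilon; otherwise the lower and upper bounds for b would be identical, and curve_fit requires each lower bound to be strictly less than its upper bound.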

0

There is a simpler option if you are willing/able to edit the original function.

Redefine your function as:

def func(x,a):
    return a*x*x + b

Then you can simply put it in your loop for parameter b:

for b in range(10):
   popt,pcov = curve_fit(func, x1, x2)

Caveat: this works only because func reads b as a global variable when it is called, so the function needs to be defined in the same script as the loop that sets b.

Rick Berg
0

SciPy's curve_fit takes three positional arguments: the model function, xdata and ydata. So an alternative approach (to using a function wrapper) is to treat 'b' as xdata (i.e. an independent variable) by building a matrix that contains both your original xdata (x1) and a second column for your fixed parameter b.

Assuming x1 and x2 are arrays:

import numpy as np
from scipy.optimize import curve_fit

def func(xdata,a):
   x, b = xdata[:,0], xdata[:,1]  # Extract your x and b
   return a*x*x + b

for b in range(10):
   xdata = np.zeros((len(x1),2))  # initialize a matrix
   xdata[:,0] = x1  # your original x-data
   xdata[:,1] = b  # your fixed parameter
   popt,pcov = curve_fit(func,xdata,x2)  # x2 is your y-data
Arjan Groen