
As a relative beginner in Python, I'm struggling to understand (and therefore use) `curve_fit` from scipy.optimize.

I've tried following the answers to previous questions ("python numpy/scipy curve fitting" and "exponential curve fitting with python"), but unfortunately had no luck getting it to work.

This is an engineering problem (dealing with measured data from a test rig), so I know that the formula has the form Y=(X/A)+(X/B)^(1/C), where A, B and C are the constants that need to be found.

[Current Code]

import numpy as np
from scipy.optimize import curve_fit

valY_list = [yyy1,yyy2,yyy3,yyy4]
valX_list = [xxx1,xxx2,xxx3,xxx4]

val_Y = np.array(valY_list)
val_X = np.array(valX_list)

def fit_func(val_X,A,B,C):
    return (val_X/A)+((val_X/B)^(1/C))

params = curve_fit(fit_func, val_X, val_Y)

[A,B,C] = params[0]

NB1: in reality valY_list & valX_list are >500 entries long (stored as floats).

NB2: I also know that the values of A, B, C should be within a certain range of values, so I want to constrain the solution when it performs the optimisation.

0.005 < A < 0.5

0.0 < B < 5000

0.0001 < C < 0.1

I realise that my code is probably quite rudimentary, and likely has lots of things missing (or glaringly obvious mistakes for an experienced coder!) so my apologies. Any help would be appreciated!

swimfar
Pete Lavelle
  • Have you tried providing [start values p0](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html) in the expected range for A, B, and C? Your question would also benefit from a toy dataset for which you know the answer, so people can use it to test improvements. – Mr. T May 25 '18 at 11:29
  • Ah, I see. [`^` is not the exponentiation operator in Python; `**` is.](https://www.tutorialspoint.com/python3/python_basic_operators.htm) `^` is the bitwise XOR operator. – Mr. T May 25 '18 at 11:32
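
The two comments above (use `**` instead of `^`, and give start values) can be combined with the `bounds` argument of `curve_fit` into a minimal sketch. The data, start values and bounds below are assumptions chosen so that the example runs on its own; they are not the measured data or the parameter ranges from the question.

```python
import numpy as np
from scipy.optimize import curve_fit

# In Python, ^ is bitwise XOR and ** is exponentiation.
print(2 ^ 3, 2 ** 3)

def fit_func(x, A, B, C):
    # Y = X/A + (X/B)^(1/C), written with ** for the power
    return (x / A) + (x / B) ** (1.0 / C)

# Toy, noise-free data generated from assumed "true" parameters
true_params = (2.0, 3.0, 0.5)
val_X = np.linspace(0.1, 10.0, 50)
val_Y = fit_func(val_X, *true_params)

# Illustrative start values and bounds, all in the order (A, B, C)
p0 = (1.5, 2.5, 0.45)
lower = (0.5, 0.5, 0.2)
upper = (10.0, 10.0, 1.0)

# bounds is a (lower, upper) pair; each entry lines up with p0
popt, pcov = curve_fit(fit_func, val_X, val_Y, p0=p0, bounds=(lower, upper))
A, B, C = popt
print(A, B, C)
```

For the ranges given in the question, the call would presumably use `bounds=([0.005, 0.0, 0.0001], [0.5, 5000.0, 0.1])` with a `p0` inside that box. A lower bound of exactly 0 for B risks a division by zero, so a small positive value may be safer, and very large exponents 1/C can overflow for X/B greater than 1.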

1 Answer


You might find lmfit (https://lmfit.github.io/lmfit-py) useful for this and other curve-fitting problems. It provides a higher-level interface to curve-fitting than curve_fit and has many convenient and advanced options for model building and working with parameters and fit statistics.

With lmfit, I would suggest the following approach:

import numpy as np
from lmfit import Model

valY_list = [yyy1,yyy2,yyy3,yyy4]
valX_list = [xxx1,xxx2,xxx3,xxx4]

val_Y = np.array(valY_list)
val_X = np.array(valX_list)

def fit_func(x, a, b, c):
    # Same model as Y = X/A + (X/B)^(1/C), with a = 1/A, b = 1/B, c = 1/C
    return (a*x)+(b*x)**c

mymodel = Model(fit_func)

params = mymodel.make_params(a=10, b=1, c=100.0)

params['a'].min = 2.0
params['a'].max = 200.

params['b'].min = 2.0e-4  # 1/5000, from the upper bound B < 5000
params['b'].max = 1.e8

params['c'].min = 10.0
params['c'].max = 10000.0

result = mymodel.fit(val_Y, params, x=val_X)

print(result.fit_report())

for par_name, param in result.params.items():
    print(par_name, param.value, param.stderr)

Note that I changed your model function slightly to use 1/PARAM (a = 1/A, b = 1/B, c = 1/C), and adjusted the bounds accordingly (I think!). The printed fit report will include fit statistics and the best-fit values and standard errors for each of the variables. Also note that with lmfit, Parameters are named according to the argument names of your fitting function, and that min/max bounds go with the Parameter object. Although the example above sets bounds, you could also fix any of the Parameters, with (for example):

params['c'].vary = False

And, if you really do want your A, B, and C, you can make parameters for these as constrained expressions:

params.add('A', expr='1/a')
params.add('B', expr='1/b')
params.add('C', expr='1/c')

These won't vary in the fit, but their values and standard errors will be reported. Essentially any valid Python expression that uses other parameter names and basic math functions can be used.
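
As a quick check of the bounds for the 1/PARAM form, the ranges from the question can be inverted with a few lines of plain Python. This is only a sketch: `reciprocal_bounds` and `question_bounds` are hypothetical names introduced here, and the helper assumes a positive parameter range.

```python
import math

def reciprocal_bounds(lo, hi):
    """Return (min, max) for 1/p, given a parameter p with 0 <= lo < p < hi."""
    if lo < 0 or hi <= lo:
        raise ValueError("expects 0 <= lo < hi")
    # Taking the reciprocal of a positive range swaps the two ends;
    # a lower limit of 0 leaves 1/p unbounded from above.
    new_max = math.inf if lo == 0 else 1.0 / lo
    return 1.0 / hi, new_max

# Ranges for A, B, C as stated in the question
question_bounds = {"A": (0.005, 0.5), "B": (0.0, 5000.0), "C": (0.0001, 0.1)}

for name, (lo, hi) in question_bounds.items():
    new_min, new_max = reciprocal_bounds(lo, hi)
    print(f"{name.lower()} = 1/{name}: min {new_min:g}, max {new_max:g}")
```

Since B is only bounded below by 0, its reciprocal has no finite upper limit, so any large finite maximum for `b` in the fit is an approximation.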

There are many more features and decent documentation and examples, but that should get you started.

M Newville
  • Are you the author of lmfit, or why does every one of your answers start with the same copy-paste part? – Mr. T May 25 '18 at 14:51
  • I am one of the authors of lmfit. Many of my answers about curve-fitting suggest using lmfit as it conveniently provides many features that are asked about. Here, the question included placing bounds on parameters. Lmfit provides a convenient way to do that -- much better (IMO) than curve_fit, which requires keeping three separate lists that preserve the same order. That really cries out for a better data structure, which is what lmfit Parameters provides. – M Newville May 25 '18 at 15:01
  • In this case, it was the wrong operator. You should answer the question first and then say, "here is another way with lmfit". Otherwise it smells of self-promotion, for which SO has rules: ["However, you must disclose your affiliation in your answers."](https://stackoverflow.com/help/promotion). – Mr. T May 25 '18 at 15:23
  • Hm, well the user *did* ask about placing bounds on parameter values. – M Newville May 25 '18 at 17:19
  • Is an answer like this self-promotion? I view it as trying to help people with curve fitting problems. I'm certainly not paid to answer questions on SO or to support the lmfit library. I view it as community service and part of my institution's academic mission. It seems to me like it is in the mission of SO, I'll stop. I'm not sure that I would take your word for what fills that mission, but I don't discount your opinion. Would you like me to remove this answer? Would you like me to stop giving such answers to other SO questions? – M Newville May 25 '18 at 17:38
  • No. That is not my intention. a) Disclose your affiliation in your answers like other people on SO who take care of a repository. b) Ask yourself before writing a post, "Is lmfit the best answer for the OP's question?" Sometimes it is a good idea to present a different approach, sometimes not. And you are free to ignore my comment; anyhow, I am an SO member like yourself. – Mr. T May 25 '18 at 18:29
  • OK, I'm confused by what you think "affiliation" means in this context. My SO profile does not list my employer, but it does have my real last name, which is searchable, especially with any of the topics listed for these questions. Are you suggesting that one disclose that they've contributed to a library mentioned in a question or answer? That seems impractical to me. I don't recall seeing it as a common practice here, but perhaps you can point to answers from you and/or others that demonstrate what you would like to see. – M Newville May 25 '18 at 19:32
  • I understand your confusion - the idea is to disclose the affiliation in *your own* answers that you make where you mention lmfit, not the answers of other people who might mention it. I have learned to do the same when I mention my pyeq3 fitting library or my zunzun.com web site, as I now understand such a disclosure to be a good professional practice on my part. – James Phillips May 26 '18 at 16:44