Using scipy.optimize.curve_fit does not fit curve properly

Question

I want to fit a random time series segment to a predefined list of functions. For demo purposes, I only use the sine function and a demo time series.

import numpy as np

amplitude = 1
omega = 2
phase = 0.5
offset = 4

def sine(x, a, b, c, d):
  """Sine function"""
  return a*np.sin(b*x+c) + d

x = np.linspace(0,100, 1000)

parameters = [amplitude, omega, phase, offset]
demo_values = sine(x, *parameters)

As mentioned in the title, I use scipy.optimize.curve_fit to try to recover the parameters:

from scipy.optimize import curve_fit

# curve_fit returns the optimal parameters and their covariance matrix
popt, pcov = curve_fit(f=sine, xdata=x, ydata=demo_values)

fitted = sine(x, *popt)

When I compare the fitted parameters with the original ones, they are quite different. I do not know what I am doing wrong.

print(f"Scipy params: {popt}")
print(f"Original params: {parameters}")
>>> Scipy params: [0.02834886 1.15624779 1.8580548  4.00011998]
>>> Original params: [1, 2, 0.5, 4]

NB. As mentioned in the introduction, I do not want a solution for the sine function only, because I would like to extend this flow to other functions as well. I have seen on SO that passing the p0 argument significantly improves accuracy, but I do not know how to make an initial guess in a generic way (for any curve).

I tried to fit a simple sine curve with scipy.optimize.curve_fit and expected the library to handle that quite nicely. Unfortunately, that was not the case. Other posts related to my question use a p0 (initial guess) argument, which I don't know how to create for a generic curve.
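
A minimal sketch of what `p0` changes for the demo above. When `p0` is omitted, `curve_fit` starts every parameter at 1. The values in `rough_guess` are an illustrative assumption (a hand-made guess near, but not equal to, the true parameters) and do not come from the post.

```python
import numpy as np
from scipy.optimize import curve_fit


def sine(x, a, b, c, d):
    """Sine model from the question."""
    return a * np.sin(b * x + c) + d


x = np.linspace(0, 100, 1000)
parameters = [1, 2, 0.5, 4]  # amplitude, omega, phase, offset
demo_values = sine(x, *parameters)

# No p0: the optimizer starts from [1, 1, 1, 1].
popt_default, _ = curve_fit(sine, x, demo_values)

# Assumed rough guess: omega and phase are deliberately slightly off.
rough_guess = [1.2, 2.01, 0.0, 3.5]
popt_guess, _ = curve_fit(sine, x, demo_values, p0=rough_guess)

print(f"Default start: {popt_default}")
print(f"Rough guess:   {popt_guess}")
```

The frequency is the parameter that most needs a reasonable start here: over 100 units of x, a small error in omega accumulates into a large phase error, so a start of omega = 1 leaves the optimizer far from the true curve.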

Take a look at [How to set up the initial value for curve_fit to find the best optimizing, not just local optimizing?](https://stackoverflow.com/q/52356128/15239951). — Corralien, May 12 '23 at 12:14
_initial guess in a generic way (for any curve)_ is basically a non-starter. If it were possible, `scipy.optimize` would already be doing it; but there's a reason it asks you to fill one out instead. — Reinderien, May 12 '23 at 12:20
What does it even mean to fit a "random series" to an arbitrary function? How would the results be meaningful? — Reinderien, May 12 '23 at 12:21
Sine fits are notorious for being sensitive to the starting parameters, which need to be near the correct values. Since you're not setting any starting parameters, they take the default values of `[1, 1, 1, 1]`. — 9769953, May 12 '23 at 12:21
To echo Reinderien's comments: you can't set good starting values in a general fashion for any function. In fact, it's quite contrary to the point of fitting a curve: you want to fit your data to a model, so you already have assumptions about the input (e.g., it follows a sine pattern). Use those assumptions for your best starting point. If you just want to fit to anything, then there is no model, and the fit results, accurate or not, don't mean a thing. — 9769953, May 12 '23 at 12:27
@Reinderien my wording is probably not the best. I mean a series that I have not seen (don't know anything about), not that the values are randomized. When I say arbitrary function, I mean a function from a list of functions that I loop through to find the best fit. — Oddaspa, May 12 '23 at 12:47
@9769953 The result, if found, would be to know the function that best resembles the curve. — Oddaspa, May 12 '23 at 13:10
My point is: how is that going to help you in inferring the underlying model? What does it tell you about the data? It's a random function. I can fit an nth-polynomial to n data points; that doesn't mean anything. — 9769953, May 12 '23 at 15:35
I suggest you use some function/package to analyse the data before the fitting. For this example, an analysis could indicate that the data is periodic, so you could use only a portion of it. Usually, the user (human) does this analysis just by looking at the plotted points, and then realizes that so many points aren't needed. Just to illustrate my suggestion, try using only the first 20, 25 or 30 points: you will get exactly the predefined parameters. — Joao_PS, May 13 '23 at 10:31
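
One way to read the comments above is that the starting values have to come from the model and the data, not from a generic recipe. For the sine model specifically, a rough guess can be taken from the data's spectrum. The sketch below assumes evenly spaced `x` and data covering several full periods; `guess_sine_p0` is a hypothetical helper written for this illustration, not part of SciPy, and it does not generalize to other model functions.

```python
import numpy as np
from scipy.optimize import curve_fit


def sine(x, a, b, c, d):
    """Sine model from the question."""
    return a * np.sin(b * x + c) + d


def guess_sine_p0(x, y):
    """Hypothetical helper: rough starting values for `sine`.

    Assumes x is evenly spaced and y spans several full periods.
    """
    offset = np.mean(y)
    centered = y - offset
    spectrum = np.fft.rfft(centered)
    freqs = np.fft.rfftfreq(len(x), d=x[1] - x[0])  # cycles per unit of x
    peak = np.argmax(np.abs(spectrum[1:])) + 1  # skip the zero-frequency bin
    omega = 2 * np.pi * freqs[peak]
    amplitude = np.sqrt(2) * np.std(centered)  # RMS of a sine is a / sqrt(2)
    # A sine lags a cosine by pi/2; the FFT phase is relative to x[0].
    phase = np.angle(spectrum[peak]) + np.pi / 2 - omega * x[0]
    return [amplitude, omega, phase, offset]


x = np.linspace(0, 100, 1000)
demo_values = sine(x, 1, 2, 0.5, 4)

p0 = guess_sine_p0(x, demo_values)
popt, pcov = curve_fit(sine, x, demo_values, p0=p0)

print(f"Initial guess: {p0}")
print(f"Fitted params: {popt}")
```

The FFT only resolves frequency to the nearest bin, so the guess for omega is approximate; the point is to land close enough for the least-squares step to refine it. Each other model in a list of candidates would need its own guess function built from what that model assumes about the data.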
