from scipy.optimize import curve_fit

def func(x, a, b):
    return a * x + b

x = [0, 1, 2, 3]
y = [160, 317, 3302, 16002]
yerr = [0.0791, 0.0562, 0.0174, 0.0079]

curve_fit(func, x, y, sigma=yerr)

I am trying to do a weighted least-squares linear regression with the code above, using scipy's curve_fit function. From this I need to get the slope and the error of the fit. I think I figured out how to get the slope using the code below, but I don't know how to extract the error of the fit.

# Code to get the slope (correct me if I'm wrong!!)
popt = curve_fit(func, x, y, sigma=yerr)
slope = popt[0]

Thank you!

======================================

Edit: I have been doing some research and I think I might have figured everything out for myself. I will do my best to explain it!

As shown in the image below, I provide this function to curve_fit, with a and b being my parameters, corresponding to the slope and intercept respectively.

When you use curve_fit, it returns a 1D array, popt, and a 2D array, pcov. popt contains the optimized values of my provided parameters, so in this case popt[0] is the slope (green) and popt[1] is the intercept (red).

Next, the pcov matrix represents the covariance of the parameters. What I'm mainly looking for here are the values on the diagonal, as these correspond to the variance of each parameter. The error of a parameter is the square root of its variance, so, color-coded again, the slope error is sqrt(pcov[0,0]) (green) and the intercept error is sqrt(pcov[1,1]) (red).

[Image: the fit function y = a * x + b annotated with the slope a (green) and the intercept b (red), matched to the corresponding popt and pcov entries]
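
Putting it together, the extraction looks something like this (a minimal sketch reusing func, x, y, and yerr from above; the square roots turn the diagonal variances into standard errors):

import numpy as np
from scipy.optimize import curve_fit

# Fit and unpack both return values
popt, pcov = curve_fit(func, x, y, sigma=yerr)

slope, intercept = popt              # optimized parameters
slope_err = np.sqrt(pcov[0, 0])      # standard error of the slope
intercept_err = np.sqrt(pcov[1, 1])  # standard error of the intercept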

From this I was able to get the slope of my line and its error. I hope this explanation can be helpful to someone else!!

– Paul
  • That code would be wrong. `curve_fit` returns `(popt, pcov)`, so your `popt[0]` would be `curve_fit`'s `popt`. – mikuszefski Sep 23 '21 at 14:28
  • That said, one needs to point out that you unnecessarily use non-linear optimization for a simple regression. – mikuszefski Sep 23 '21 at 14:35
  • So what would a better solution be? I was under the impression that I needed curve_fit so I could include the error in my y values. In truth I don't fully understand this function! – Paul Sep 23 '21 at 16:09

1 Answer


As said in the comments:

curve_fit is a non-linear fit that is definitely not necessary for a simple linear regression.

If it is used anyway, your code would need to look like this:

popt, pcov = curve_fit(func, x, y, sigma=yerr)
slope = popt[0]

That said, it is better to use a linear approach. One is given here, and the explanation goes like this:

Use numpy.linalg.lstsq to solve A x = b in the least-squares sense; the math behind it minimizes ( A x - b ).T ( A x - b ). The weighted problem minimizes ( A x - b ).T W ( A x - b ) instead. With W being diagonal we can write it as W = V.T V, so one has

( A x - b ).T W ( A x - b )

= ( A x - b ).T V.T V ( A x - b )

= ( V A x - V b ).T ( V A x - V b )

Hence, the weighted problem is reduced to the original one with A -> V A and b -> V b.

Note, W does not need to be diagonal, but it will be real and symmetric, and if an inverse exists we will be able to diagonalize it as W = Q D Q.T, with Q being orthogonal and D diagonal.
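
A minimal sketch of that reduction in code, assuming yerr holds absolute 1-sigma errors, so the weights are 1 / sigma**2 and V = diag(1 / sigma). The covariance step at the end is the standard weighted-least-squares result, added here to answer the error question; it is not part of the derivation above:

import numpy as np

x = np.array([0, 1, 2, 3], dtype=float)
y = np.array([160, 317, 3302, 16002], dtype=float)
yerr = np.array([0.0791, 0.0562, 0.0174, 0.0079])

# Design matrix for y = a * x + b: one column for x, one for the constant term
A = np.vstack([x, np.ones_like(x)]).T

# V = diag(1 / sigma), so W = V.T V = diag(1 / sigma**2)
V = np.diag(1.0 / yerr)

# Reduced problem: A -> V A, b -> V y
coef, residuals, rank, sv = np.linalg.lstsq(V @ A, V @ y, rcond=None)
slope, intercept = coef

# Parameter covariance ( (V A).T (V A) )^-1 = ( A.T W A )^-1;
# the parameter errors are the square roots of its diagonal
cov = np.linalg.inv((V @ A).T @ (V @ A))
slope_err, intercept_err = np.sqrt(np.diag(cov))

For a linear model this should agree with curve_fit(func, x, y, sigma=yerr, absolute_sigma=True).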

– mikuszefski