
What are the values I print now with sigma_ab and how can I calculate the confidence interval at 95?

for g in all:
    c0 = 5
    c2 = 0.2
    c3 = 0.7
    start = g['y'].iloc[0]
    
    p0 = np.array([c0, c2, c3])  # Construct initial guess array

    popt, pcov = curve_fit(
         model, g['x'], g['y'],
         absolute_sigma=True, maxfev=100000
    )
    
    sigma_ab = np.sqrt(np.diagonal(pcov))
    n = g.name
    print(n+' Estimated parameters: \n', popt)
    print(n + ' Approximated errors: \n', sigma_ab)

These are the estimated parameters

[0.24803625 0.06072472 0.46449578]

This is sigma_ab but I don't know exactly what it is. I would like to calculate the upper and lower limit of the mean with 95% confidence interval.

[1.32778766 0.64261562 1.47915215]
mikuszefski
  • Does this answer your question? [How to get confidence intervals from curve\_fit](https://stackoverflow.com/questions/39434402/how-to-get-confidence-intervals-from-curve-fit) – mkrieger1 Feb 18 '22 at 17:28
  • @mkrieger1 I saw this post and used the line: sigma_ab = np.sqrt(np.diagonal(pcov)), but I am not sure what this output really means. It's the one I also show in the code above: I would expect an upper and lower limit. I could not download uncertainties – Martina Lazzarin Feb 18 '22 at 17:33
  • I think this is more a statistics question than a programming question. – mkrieger1 Feb 18 '22 at 18:12

1 Answer


Your sigma_ab (the square root of the diagonal elements of the covariance matrix) gives the 1-sigma (68.3%) uncertainties. If the distribution of your uncertainties is strictly Gaussian (often a good but not perfect assumption, so maybe "a decent starting estimate"), then the 2-sigma (95.45%) uncertainties will be twice those values.
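As a concrete sketch of that rule of thumb (using a stand-in exponential model, since the question doesn't show its model function), a two-sided 95% interval is popt plus or minus 1.96 times sigma_ab under the Gaussian assumption:

```python
import numpy as np
from scipy.optimize import curve_fit

# Stand-in 3-parameter model, just to have something concrete to fit.
def model(x, a, b, c):
    return a * np.exp(-b * x) + c

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = model(x, 2.5, 0.8, 0.3) + rng.normal(scale=0.05, size=x.size)

popt, pcov = curve_fit(model, x, y, p0=[2, 1, 0])
sigma_ab = np.sqrt(np.diag(pcov))  # 1-sigma parameter uncertainties

z = 1.96  # two-sided 95% quantile of the standard normal distribution
lower = popt - z * sigma_ab
upper = popt + z * sigma_ab
for name, p, lo, hi in zip("abc", popt, lower, upper):
    print(f"{name} = {p:.4f}, 95% CI: [{lo:.4f}, {hi:.4f}]")
```

(Strictly speaking, with few data points you would use a Student-t quantile instead of 1.96, but for the Gaussian approximation above this is the standard shortcut.)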

If you want a more detailed measure (and one that doesn't assume symmetric uncertainties), you might find lmfit and its Model class helpful. By default (and when possible) it will report 1-sigma uncertainties from the covariance, which is fast, and usually pretty good. It can also explicitly bracket and find 1-, 2-, 3-sigma uncertainties, positive and negative separately. You didn't give a very complete example, so it's a little hard to tell what your model function is doing. If you have a model function like:

def modelfunc(x, amp, cen, sigma):
    return amp * np.exp(-(x-cen)*(x-cen)/sigma**2)

you could use

import numpy as np
import lmfit

def modelfunc(x, amp, cen, sigma):
    return amp * np.exp(-(x-cen)*(x-cen)/sigma**2)

x = np.linspace(-10.0, 10.0, 201)
y = modelfunc(x, 3.0, 0.5, 1.1) + np.random.normal(scale=0.1, size=len(x))

model = lmfit.Model(modelfunc)
params = model.make_params(amp=5., cen=0.2, sigma=1)

result = model.fit(y, params, x=x)
print(result.fit_report())

# now calculate explicit 1-, 2-, and 3-sigma uncertainties:
ci = result.conf_interval(sigmas=[1,2,3])
lmfit.printfuncs.report_ci(ci)

which will print out

[[Model]]
    Model(modelfunc)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 21
    # data points      = 201
    # variables        = 3
    chi-square         = 1.93360112
    reduced chi-square = 0.00976566
    Akaike info crit   = -927.428077
    Bayesian info crit = -917.518162
[[Variables]]
    amp:    2.97351225 +/- 0.03245896 (1.09%) (init = 5)
    cen:    0.48792611 +/- 0.00988753 (2.03%) (init = 0.2)
    sigma:  1.10931408 +/- 0.01398308 (1.26%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(amp, sigma) = -0.577
          99.73%    95.45%    68.27%    _BEST_    68.27%    95.45%    99.73%
 amp  :  -0.09790  -0.06496  -0.03243   2.97351  +0.03255  +0.06543  +0.09901
 cen  :  -0.03007  -0.01991  -0.00992   0.48793  +0.00991  +0.01990  +0.03004
 sigma:  -0.04151  -0.02766  -0.01387   1.10931  +0.01404  +0.02834  +0.04309

which gives explicitly calculated uncertainties and shows that, for this case, the very fast covariance-based estimates of the 1-sigma uncertainties are very good, and the 2-sigma values are close to twice the 1-sigma values. You shouldn't really trust more than two significant digits of an uncertainty anyway.

Finally, in your example you construct p0 but never pass it to curve_fit, which illustrates a serious flaw in curve_fit: without p0 it silently starts every parameter at 1.0 instead of raising an error, so your carefully chosen initial values are ignored.
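A minimal corrected version of the call from the question might look like this (the three-parameter model here is a hypothetical stand-in, since the question doesn't show its model function):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, c0, c2, c3):  # stand-in model for illustration only
    return c0 * np.exp(-c2 * x) + c3

rng = np.random.default_rng(1)
x = np.linspace(0, 20, 80)
y = model(x, 5.0, 0.2, 0.7) + rng.normal(scale=0.1, size=x.size)

p0 = np.array([5, 0.2, 0.7])  # note: no trailing comma, or p0 becomes a tuple
popt, pcov = curve_fit(model, x, y, p0=p0, maxfev=100000)

sigma_ab = np.sqrt(np.diagonal(pcov))
print('Estimated parameters:', popt)
print('Approximated 1-sigma errors:', sigma_ab)
```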

M Newville