1

I am not an experienced programmer, and needed to fit my data values into a Gaussian graph. The code below was from Gaussian fit for Python.

Using Anaconda, this error message was obtained:

runfile('D:/Anaconda3/gaussian.py', wdir='D:/Anaconda3')

C:\Users\lion\Anaconda3\lib\sitepackages\scipy\optimize\minpack.py:779:OptimizeWarning: Covariance of the parameters could not be estimated category=OptimizeWarning).

The fit gaussian graph was also not as expected; a horizontal curve instead of a gaussian fit curve.

Graph Image: Graph Image

Any help would be appreciated!

Code Used:

import pylab as plb
import matplotlib.pyplot as plt

from scipy.optimize import curve_fit
from scipy import asarray as ar,exp


x = ar(range(399))
y = ar([1, 0, 1, 0, 2, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 2, 1, 0, 2, 1, 1, 0, 1, 0, 1, 0, 0, 2, 0, 2, 
        3, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 2, 0, 3, 3, 0, 1, 2, 1, 1, 2, 3, 1, 
        4, 1, 1, 1, 0, 1, 0, 3, 4, 0, 1, 3, 0, 2, 0, 3, 0, 0, 0, 3, 2, 0, 2, 0, 0, 1, 2, 0, 0, 0, 
        3, 2, 1, 0, 1, 3, 4, 2, 4, 1, 2, 1, 1, 2, 0, 2, 2, 6, 2, 4, 2, 0, 1, 2, 2, 3, 4, 6, 2, 3, 
        2, 4, 1, 4, 9, 6, 6, 4, 5, 6, 3, 7, 8, 9, 7, 8, 8, 4, 10, 10, 12, 9, 18, 18, 16, 14, 13, 13, 
        15, 17, 16, 26, 24, 37, 34, 36, 40, 48, 52, 52, 50, 56, 68, 90, 71, 107, 93, 117, 134, 207, 
        200, 227, 284, 287, 337, 379, 449, 471, 626, 723, 848, 954, 1084, 1296, 1481, 1676, 1898, 2024, 
        2325, 2692, 3110, 3384, 3762, 4215, 4559, 5048, 5655, 6092, 6566, 6936, 7513, 8052, 8414, 9016, 
        9303, 9598, 9775, 10100, 10265, 10651, 10614, 10755, 10439, 10704, 10233, 10086, 9696, 9467, 9156, 
        8525, 8200, 7609, 7156, 6678, 6160, 5638, 5227, 4574, 4265, 3842, 3380, 3029, 2767, 2512, 2018, 1856, 1645, 
        1463, 1253, 1076, 943, 787, 711, 588, 512, 448, 361, 304, 303, 251, 190, 185, 154, 134, 114, 105, 
        86, 88, 83, 79, 50, 60, 49, 28, 33, 37, 28, 31, 22, 
        14, 26, 19, 17, 15, 9, 17, 13, 11, 11, 12, 18, 8, 6, 9, 6, 3, 6, 6, 6, 6, 11, 9, 15, 
        3, 3, 1, 2, 3, 2, 6, 3, 4, 3, 4, 5, 3, 1, 1, 2, 1, 0, 4, 3, 2, 3, 1, 3, 3, 4, 0, 3, 5, 0, 
        3, 1, 2, 0, 2, 2, 1, 0, 5, 1, 3, 0, 0, 3, 0, 1, 3, 0, 1, 0, 2, 4, 0, 0, 0, 0, 0, 1, 0, 2, 
        0, 1, 1, 0, 1, 0, 0, 4, 0, 0, 0, 2, 0, 3, 0, 2, 1, 2, 2, 0, 0, 0, 1, 4, 1, 0, 1, 2, 0, 1, 
        1, 1, 1, 2, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1])

n = len(x)                          #the number of data
mean = sum(x*y)/n                   #note this correction
sigma = sum(y*(x-mean)**2)/n        #note this correction

def gaus(x,a,x0,sigma):
    return a*exp(-(x-x0)**2/(2*sigma**2))

popt,pcov = curve_fit(gaus,x,y,p0=[1,mean,sigma])

plt.plot(x,y,'b+:',label='data')
plt.plot(x,gaus(x,*popt),'ro:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Intensity (Counts)')

plt.show()
WorldSEnder
  • 4,875
  • 2
  • 28
  • 64
Fad Lion
  • 11
  • 3
  • I'd have to take a closer look when I'm at a computer, but just looking at your plot, you might try bounding your data more tightly to cut off those long, nearly flat tails which are probably what's throwing off the covariance estimate. – Iguananaut Oct 14 '17 at 21:48
  • Unrelated, but some tips on using numpy: instead of importing `asarray` from the scipy module (as really it's party of Numpy) use numpy directly: `import numpy as np`. Instead of `ar(range(N))` you can use just `np.arange(N)` to make a monotonically increasing array. Also use `np.sum()` instead of the Python built-in sum. For a few hundred points it doesn't much matter, but beyond that the difference can be orders of magnitude. – Iguananaut Oct 14 '17 at 21:53
  • I see now you were just copying someone else's code. To be clear, all I mean by "bound" is to try passing, say, `x[100:300]` to `curve_fit` and similarly for `y`. – Iguananaut Oct 14 '17 at 21:59

2 Answers2

0

When fitting a Gaussian, correctly computing μ and σ is kind of a big deal. You cited your reference (thank you!), but didn't follow it. Please import math and substitute these two lines into your code:

mean = sum(x * y) / sum(y)
sigma = math.sqrt(sum(y * (x - mean)**2) / sum(y))

(It turns out if mean is any number between -7200 and 10,000 the fit still converges nicely, and similarly for positive sigmas less than 805.)

Gaussian fit

J_H
  • 17,926
  • 4
  • 24
  • 44
0

You may find lmfit (http://lmfit.github.io/lmfit-py/) useful for this sort of problem. It has builtin models for common peak shapes like Gaussian and simplifies many curve-fitting tasks. You example would look like this (skipping the data):

import matplotlib.pyplot as plt

from lmfit.models import GaussianModel

gmodel = GaussianModel()
params = gmodel.guess(y, x=x)
result = gmodel.fit(y, params, x=x)

print(result.fit_report())

plt.plot(x, y,'b+:',label='data')
plt.plot(x, result.best_fit, 'ro:', label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Intensity (Counts)')
plt.show()

which would print out the fit report of

[[Model]]
    Model(gaussian)
[[Fit Statistics]]
    # function evals   = 23
    # data points      = 399
    # variables        = 3
    chi-square         = 1130482.300
    reduced chi-square = 2854.753
    Akaike info crit   = 3177.728
    Bayesian info crit = 3189.695
[[Variables]]
    sigma:       13.0637138 +/- 0.019425 (0.15%) (init= 14.5)
    center:      210.606022 +/- 0.019425 (0.01%) (init= 210.5)
    amplitude:   3.4580e+05 +/- 445.3134 (0.13%) (init= 467842.5)
    fwhm:        30.7626945 +/- 0.045744 (0.15%)  == '2.3548200*sigma'
    height:      10560.0217 +/- 13.59907 (0.13%)  == '0.3989423*amplitude/max(1.e-15, sigma)'
[[Correlations]] (unreported correlations are <  0.100)
    C(sigma, amplitude)          =  0.577

and produce a plot showing a good fit.

Note that the definition of Gaussian is slightly different, and your a would correspond to the value of height listed above. The value listed as amplitude would be the integrated area.

M Newville
  • 7,486
  • 2
  • 16
  • 29