0

I have a problem fitting some date with Gaussian function. I tried to do it in multiple different ways but none of them worked. I need some ideas please. The data is attached (columns 2 and 3).


import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import asarray as ar,exp



x = ar(range(19))
y = ar(0, 0, 0, 0, 0, 0, 0.01955, 1.163025, 19.7159833333333, 81.3119708333334,80.0329166666667,19.3835833333333, 0.03378, 0, 0, 0, 0, 0, 0)

#y = ar(007, 0.04, .175, .628, 1.89, 4.78,10.034,17.542, 25.589, 31.1, 31.544, 26.65, 18.74, 11.01, 5.39, 2.209, 0.74, 0.215. 0.049)


n = len(x)                          
mean = sum(x*y)/n                   
sigma = sum(y*(x-mean)**2)/n        

def gaus(x,a,x0,sigma):
    return a*exp(-(x-x0)**2/(2*sigma**2))

popt,pcov = curve_fit(gaus,x,y)
#popt,pcov = curve_fit(gaus,x,y,p0=[1,mean,sigma])

plt.scatter(x,y, color='blue')
plt.plot(x,y,label='data', marker='', color='blue', linestyle='-', linewidth=2)
plt.scatter(x,gaus(y,*popt), color='red')
plt.plot(x,gaus(y,*popt),label='fit', marker='', color='Red', linestyle='--', linewidth=2)

print(len(x))
print(mean,sigma)

plt.legend()

plt.xlabel('No of Resets', fontsize=20)
plt.ylabel('Frequency', fontsize=20)

plt.legend(loc='upper right')
plt.title('Gaussian Fit', fontsize=20)

plt.show()

HMM
  • 17
  • 3
  • 2
    The data is not attached, that's an image what is attached... In order to make your code testable and fixable, you could rather paste those numbers inside as lists for the `y = ar(...)` line. Do that for both columns, just comment one of them. – tevemadar Jun 11 '22 at 23:50
  • 1
    Just a typo: you're plotting `x` vs `gauss(y, ...)`. You need to plot `x` vs `gauss(x, ...)` – ddejohn Jun 12 '22 at 00:59
  • @ddejohn: I don't think this correct. I changed it to x vs gauss(x, ...), it didn't change anything. – HMM Jun 12 '22 at 15:31
  • Turns out this is a duplicate: https://stackoverflow.com/questions/19206332/gaussian-fit-for-python – ddejohn Jun 12 '22 at 20:42
  • @HANY that's not how functions work though. A function in the cartesian coordinate system is `y = f(x)`. By doing `gauss(y, ...)` you are trying to plot `y = f(y)` which makes no sense. The reason it didn't change anything is because your mean and sigma are incorrectly calculated. – ddejohn Jun 12 '22 at 21:07
  • Note that you can draw your lines with `marker = ".", markersize=10` -- no need to plot a scatter over a line. – ddejohn Jun 12 '22 at 21:14
  • You also generally want to plot *other* x-values, typically an `np.linspace` over the domain to show what the actual function fitted to the data looks like. – ddejohn Jun 12 '22 at 21:15
  • @ddejohn: Thank you. The code works fine now. One question remains: How calculate the uncertainly in my Gaussian fit ? Does the covariant (pcov) matrix give the answer ? -Thank you – HMM Jun 13 '22 at 22:38
  • Yes, `perr = np.sqrt(np.diag(pcov))` gives the error as described in the documentation https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html – Carlos Horn Jun 14 '22 at 06:17

1 Answers1

1

I agree with @ddejohn.

However, you are calculating the mean and std wrongly. You could use the following approximation for the integral

import numpy as np
mean = (x*(y/y.sum())).sum()
sigma = np.sqrt(((y/y.sum())*(x-mean)**2).sum())

These should be used as initial guess for the fit as in your commented line, where you can also add a0 = y.max() for the amplitude.

popt,pcov = curve_fit(gaus,x,y,p0=[a0,mean,sigma])

Then plot as @ddejohn said maybe with more sample points

xx = np.linspace(x[0], x[-1], 100)
plt.plot(xx,gaus(xx,*popt),label='fit', marker='', color='Red', linestyle='--', linewidth=2)
Carlos Horn
  • 1,115
  • 4
  • 17
  • `y.mean()` and `y.std(ddof=1)` are perfectly sufficient... Please also explicitly include how to call `curve_fit` with the proper parameters. Otherwise good answer. – ddejohn Jun 12 '22 at 20:58
  • I did not add the curve fit call with initial guess, because it is already in the code of the question, but commented out. However, good point I will add it. – Carlos Horn Jun 12 '22 at 21:33
  • @ddejohn, I guess you meant `x.mean()` is sufficient, because the mean value is in units of x, where `y(x)*dx` gives the weight or probability of the given position. – Carlos Horn Jun 13 '22 at 11:35
  • Ah, yes, a typo. – ddejohn Jun 13 '22 at 14:34