Improving Gaussian fitting using curve_fit from scipy and python 3.x

Question

I believe I am successfully implementing gaussian fitting using curve fit from scipy. But the problem that I am running into is that... the fit isn't so great, because the optimized parameter is changing the centroid.

    data =np.loadtxt('mock.txt')
    my_x=data[:,0]
    my_y=data[:,1]

    def gauss(x,mu,sigma,A):
        return A*np.exp(-(x-mu)**2/2/sigma**2)
    def trimodal_gauss(x,mu1,sigma1,A1,mu2,sigma2,A2,mu3,sigma3,A3):
        return gauss(x,mu1,sigma1,A1)+gauss(x,mu2,sigma2,A2)+gauss(x,mu3,sigma3,A3)



    """""
    Gaussian fitting parameters recognized in each file
    """""
    first_centroid=(10180.4*2+9)/9
    second_centroid=(10180.4*2+(58.6934*1)+7)/9
    third_centroid=(10180.4*2+(58.6934*2)+5)/9
    centroid=[]
    centroid+=(first_centroid,second_centroid,third_centroid)

    apparent_resolving_power=1200
    sigma=[]
    for i in range(len(centroid)):
        sigma.append(centroid[i]/((apparent_resolving_power)*2.355))

    height=[1,1,1]

    p=[]    

    p = np.array([list(t) for t in zip(centroid, sigma, height)]).flatten() 


    popt, pcov = curve_fit(trimodal_gauss,my_x,my_y,p0=p)

output:

I understand that there are a lot of peaks here but I really need it to fit only three Gaussians but at the right centroid(given in my initial guess). In other words, I really don't hope that the centroid I give is not changing. Has anyone encountered a challenge as such? and could please help me what I could do to make it happen?

In general, it seems you'd better fit the correct number of peaks (at least 5, maybe 6) and only take the three results that you actually care about. Your current approach will do a bad job since the peaks you are not fitting will influence the result for the three peaks that you care about. It will "think" that the additional stuff is part of the three peaks. — NichtJens, Oct 21 '17 at 00:51

NichtJens · Answer 1 · 2017-10-21T14:30:30.693

You should define three separate functions with fixed values for the centers. Then you fit the sum function of these functions for only the leftover parameters.

Simply put, your trimodal_gauss() should not take mus but only As and sigmas. The mus should be constants.

A trivial (but not very general) way of doing this is:

def trimodal_gauss(x, sigma1, A1, sigma2, A2, sigma3, A3):
    mu1 = 1234 # put your mu's here
    mu2 = 2345
    mu3 = 3456
    g1 = gauss(x, mu1, sigma1, A1)
    g2 = gauss(x, mu2, sigma2, A2)
    g3 = gauss(x, mu3, sigma3, A3)
    return g1 + g2 + g3

From this one can generalize the idea by a "generator" for trimodal_gauss functions that takes the three (or n?) mus and creates the function of the other parameters. Like so:

def make_trimodal_gauss(mu1, mu2, mu3):
    def result_function(x, sigma1, A1, sigma2, A2, sigma3, A3):
        g1 = gauss(x, mu1, sigma1, A1)
        g2 = gauss(x, mu2, sigma2, A2)
        g3 = gauss(x, mu3, sigma3, A3)
        return g1 + g2 + g3
    return result_function


mu1 = 1234 # put your mu's here
mu2 = 2345
mu3 = 3456

trimodal_gauss = make_trimodal_gauss(mu1, mu2, mu3)

#usage like this: trimodal_gauss(x, sigma1, A1, sigma2, A2, sigma3, A3)

thanks for your comment. If mu isn't a part of the function, how are we going to ensure that gaussian will be fit at the centroid of interest? — user7852656, Oct 21 '17 at 00:19

score -1 · Accepted Answer · answered Oct 21 '17 at 20:07

If you use the lmfit module (https://github.com/lmfit/lmfit-py), you can easily put bounds on the centroids of your Gaussian functions, or even fix them. Lmfit also makes it easy to build up multi-peak models.

You didn't give a complete example or link to your data, but a fit with lmfit to your data might look like this:

import numpy as np
from lmfit import GaussianModel
data =np.loadtxt('mock.txt')
my_x=data[:,0]
my_y=data[:,1]

model = ( GaussianModel(prefix='p1_') +
          GaussianModel(prefix='p2_') +
          GaussianModel(prefix='p3_') )

params = model.make_params(p1_amplitude=100, p1_sigma=2, p1_center=2262,
                           p2_amplitude=100, p2_sigma=2, p2_center=2269,
                           p3_amplitude=100, p3_sigma=2, p3_center=2276,
                       )

# set boundaries on the Gaussian Centers:
params['p1_center'].min = 2260
params['p1_center'].max = 2264

params['p2_center'].min = 2267
params['p2_center'].max = 2273

params['p3_center'].min = 2274
params['p3_center'].max = 2279

# or you could just fix one of the centroids like this:
params['p3_center'].vary = False

# if needed, you could force all the sigmas to be the same value
# or related by simple mathematical expressions
params['p2_sigma'].expr = 'p1_sigma'
params['p3_sigma'].expr = '2*p1_sigma'

# fit this model to data:
result  = model.fit(my_y, params, x=my_x)

# print results
print(result.fit_report())

# evaluate individual gaussian components:
peaks = model.eval_components(params=result.params, x=my_x)

# plot results:
plt.plot(my_x, my_y, label='data')
plt.plot(my_x, result.best_fit, label='best fit')
plt.plot(my_x, peaks['p1_'])
plt.plot(my_x, peaks['p2_'])
plt.plot(my_x, peaks['p3_'])
plt.show()

`scipy.curve_fit` knows bounds just as well in current version. See here: https://stackoverflow.com/questions/16760788/python-curve-fit-library-that-allows-me-to-assign-bounds-to-parameters — NichtJens, Oct 21 '17 at 22:52

Improving Gaussian fitting using ***curve_fit*** from scipy and python 3.x

2 Answers2

Improving Gaussian fitting using curve_fit from scipy and python 3.x