0

I have a bunch of code that isolates a mass spectrometry peak from a spectrum and have placed the values of the peak into two lists. Gaussian_x and gaussian_x. I now need to fit this curve, ideally using Levenberg-Marquardt or similar algorithm so that I can then calculate the peak area once it's fitted.

There isn't much help when it comes to bi-gaussian peaks (Peaks that have different values of sigma for the first and second half of the peak) and so I'm struggling to find a way to get it to work.

yvals = np.asarray(gaussian_y)
model = SkewedGaussianModel()

# set initial parameter values
params = model.make_params(amplitude=a, center=b, sigma=c, gamma=d)

# adjust parameters to best fit data.
result = model.fit(yvals, params, x=xvals)

print(result.fit_report())
plt.plot(xvals, yvals)
plt.plot(xvals, result.init_fit)
plt.plot(xvals, result.best_fit)
plt.show()

When I plot this, there's just a straight line on the graph at y=0 and then the plot of the two lists. I've no idea why the yvalues aren't registering with the best_fit and init_fit.

Here's the full code:

import pymzml
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import SkewedGaussianModel
import filepath

TARGET_MASS = 152
MASS_WIDTH = 1

PEAK_CENTER = 0.9
PEAK_WIDTH = 0.1

gaussian_x = []
gaussian_y = []

a = 1
b = 400
c = 100
d = 280


def main(in_path):

    x_array = []
    y_array = []

    run = pymzml.run.Reader(in_path)
    for spectrum in run:

        if spectrum.ms_level == 1:
            mass_array = []
            intensity_array = []
            target_array = []

            # change tuple into array.
            my_tuple = spectrum.peaks("raw")
            my_array = np.asarray(my_tuple)

            # append the first element of each index to mass array and second element to intensity array.
            for i in my_array:
                mass_array.append(i[0])
                intensity_array.append(i[1])

            # for index and value in mass_array, absolute value of mass - target mass taken.
            # append the abs_mass values to a separate array.
            for i, mass in enumerate(mass_array):
                abs_mass = abs(mass - TARGET_MASS)
                target_array.append(abs_mass)

                # if the absolute mass values are less than selected mass_width, append the value in the intensity_array
                # at that same index same index
                if abs_mass < MASS_WIDTH:
                    y_array.append(intensity_array[i])
                    x_array.append(spectrum.scan_time_in_minutes())

    new_x = []
    new_y = []

    # loop through x values
    for i, x1 in enumerate(x_array):
        # temporary store for where there are multiple y values for the same x value
        y_temp = []

        # ensure we only run through a search for each UNIQUE x value once
        if x1 in new_x:
            # moves onto the next for loop interation without executing the below
            continue

        # add unique x value to new_x
        new_x.append(x1)

        # loop through the x values again, looking for where there are duplicates
        for j, x2 in enumerate(x_array):
            if x1 == x2:
                # where we find a duplicate entry, add the value of y to a temporary array
                y_temp.append(y_array[j])

        # after accumulating all values of y for the same x value,
        # add it to the new_y array
        new_y.append(max(y_temp))

    lower_bound = PEAK_CENTER - (PEAK_WIDTH / 2)
    upper_bound = PEAK_CENTER + (PEAK_WIDTH / 2)

    # for each index and value in retention time array
    for i, x in enumerate(new_x):
        # if x greater than the lower bound and smaller than the upper bound then append x and y value
        # to new lists
        if lower_bound < x < upper_bound:
            gaussian_x.append(x)
            gaussian_y.append(new_y[i])

        # if x greater than upper bound, stop function
        if x > upper_bound:
            break

    xvals = np.asarray(gaussian_x)
    yvals = np.asarray(gaussian_y)
    model = SkewedGaussianModel()

    # set initial parameter values
    params = model.make_params(amplitude=a, center=b, sigma=c, gamma=d)

    # adjust parameters to best fit data.
    result = model.fit(yvals, params, x=xvals)

    print(result.fit_report())
    plt.plot(xvals, yvals)
    plt.plot(xvals, result.init_fit)
    plt.plot(xvals, result.best_fit)
    plt.show()


if __name__ == "__main__":
    main(in_path)

Fit report says:

[[Model]]
    Model(skewed_gaussian)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 5
    # data points      = 22
    # variables        = 4
    chi-square         = 1.6213e+10
    reduced chi-square = 9.0074e+08
    Akaike info crit   = 457.197124
    Bayesian info crit = 461.561294
##  Warning: uncertainties could not be estimated:
    amplitude:  at initial value
    center:     at initial value
    sigma:      at initial value
    gamma:      at initial value
[[Variables]]
    amplitude:  1.00000000 (init = 1)
    center:     400.000000 (init = 400)
    sigma:      100.000000 (init = 100)
    gamma:      280.000000 (init = 280)
    height:     0.00398942 == '0.3989423*amplitude/max(2.220446049250313e-16, sigma)'
    fwhm:       235.482000 == '2.3548200*sigma'
  • please post a complete example. For sure, the values of `a`, `b`, etc are important to know -- are your initial values reasonable? Also: what does the fit report say? – M Newville Mar 25 '20 at 17:42
  • the example does not run. It has a nonstandard reader and way too much code parsing and selecting data (or something). The fit results suggest your starting values (which are selected independent of the data) are so far off that the fit cannot refine these values. Without a mvce (https://stackoverflow.com/help/asking) we cannot know why that is. – M Newville Mar 27 '20 at 12:18

1 Answers1

0

Seems like the initial parameters of your fit is not good enough. See the following discussion for a better guess for your parameters; https://stackoverflow.com/a/19207683/13131172

Can
  • 117
  • 3
  • Are there any other methods I can use to fit to a bi-gaussian curve? There doesn't seem to be many solutions to bi-gaussian curves at all. – Tommy Holt Mar 28 '20 at 11:12