How to form a baseline, subtract the baseline from the original plot and repeat the baseline subtraction across 160 data frames?

Question

Other threads have covered finding a baseline and subtracting it from data, but they only did it for a single graph and used some sort of fitting. My application is similar, but their code did not work for my case. Essentially, I want to take my original data, form a baseline, subtract the baseline from the original data, take the maximum of each data frame and plot the maxima. My biggest problem right now is forming a baseline in a way that can be looped over every data frame (there are 160 data frames).

I have tried other methods that use more complex fitting or algorithms, but they either did not work or were too hard to adapt to my simpler graphs. So far, I am able to find and plot all 160 maxima. All I need help with is subtracting the background so that the maxima are more in line with each other.

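import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

filestoprocess = []  # paths to the 160 spectrum files
peak1 = []

for filename in filestoprocess:
    dfspectra = pd.read_csv(filename, skiprows=13, delimiter='\t', header=None,
                            names=['Wavelength (nm)', 'Absorbance'])
    # Maximum absorbance within the peak window (rows 162 to 217)
    ymax1 = np.max(dfspectra['Absorbance'][162:218])
    peak1.append(ymax1)

# One point per data frame
x = np.arange(len(peak1))
fig, ax = plt.subplots()
ax.plot(x, peak1)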

This is the code I have so far. It simply puts the maxima into an array and plots the array. I have no idea how to start with forming a baseline, subtracting it from each data frame and then plotting the resulting maxima.

For some reason, I am not able to upload my plot, but it is simply an absorbance spectrum with a Gaussian shape that flattens out. I don't know if some sort of Gaussian fit is necessary for baseline subtraction. I expect the baseline to simply be a more stable and flat version of the spectrum I have, which will help normalize the maxima.

EDIT*:

Here's code that I saw in another thread and tried, but I wasn't sure if I was using it correctly. I don't know if it would work for my application.

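import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def baseline_als(y, lam, p, niter=10):
  L = len(y)
  # Second-difference operator used as the smoothness penalty
  D = sparse.csc_matrix(np.diff(np.eye(L), 2))
  w = np.ones(L)
  for i in range(niter):  # xrange exists only in Python 2
    W = sparse.spdiags(w, 0, L, L)
    Z = W + lam * D.dot(D.transpose())
    z = spsolve(Z, w*y)
    # Points above the current baseline get weight p, points below get 1-p
    w = p * (y > z) + (1-p) * (y < z)
  return z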

What do you mean when you say that the baseline correction methods you tried did not work? Can you provide an example (with the code)? — Ludovick Bégin, Jun 25 '19 at 20:21
I tried this method here: https://stackoverflow.com/questions/29156532/python-baseline-correction-library. This was similar to what I wanted, but it involved fitting that I wasn't sure would apply to my dataframe. I also wasn't sure how to adapt the code to work with mine. `def baseline_als(y, lam, p, niter=10): L = len(y) D = sparse.csc_matrix(np.diff(np.eye(L), 2)) w = np.ones(L) for i in xrange(niter): W = sparse.spdiags(w, 0, L, L) Z = W + lam * D.dot(D.transpose()) z = spsolve(Z, w*y) w = p * (y > z) + (1-p) * (y < z) return z` — Brandon Tran, Jun 25 '19 at 21:13
You should use the second answer, which works with Python 3. Then in your code, if you have something like `y = dfspectra['Absorbance']`, add something like `baseline = baseline_als(y, lam=10**5, p=0.01)`, `y_corrected = y - baseline`, `ymax = np.max(y_corrected)`. Remember to play around with `p` and `lam` to better fit your data (cues: 0.001 ≤ p ≤ 0.1, 10^2 ≤ lam ≤ 10^9). — Ludovick Bégin, Jun 25 '19 at 22:15
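The suggestion in the last comment could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name `corrected_peak_maxima` is hypothetical, the defaults `lam=1e5` and `p=0.01` are the starting values from the comment and would likely need tuning, and the second-difference matrix is built with `sparse.diags` instead of the dense `np.eye` construction in the question (a substitution made here to avoid a large dense matrix, not something confirmed by the thread).

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def baseline_als(y, lam, p, niter=10):
    """Asymmetric least squares baseline estimate (Python 3 sketch)."""
    y = np.asarray(y, dtype=float)
    L = len(y)
    # Sparse second-difference operator, shape (L, L-2)
    D = sparse.diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2))
    penalty = lam * D.dot(D.transpose())
    w = np.ones(L)
    for _ in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = sparse.csc_matrix(W + penalty)
        z = spsolve(Z, w * y)
        # Points above the baseline get weight p, points below get 1 - p
        w = p * (y > z) + (1 - p) * (y < z)
    return z


def corrected_peak_maxima(spectra, lo, hi, lam=1e5, p=0.01):
    """Hypothetical helper: subtract an ALS baseline from each spectrum
    and return the maximum of the corrected signal in rows lo..hi-1."""
    maxima = []
    for y in spectra:
        y = np.asarray(y, dtype=float)
        corrected = y - baseline_als(y, lam, p)
        maxima.append(corrected[lo:hi].max())
    return np.array(maxima)
```

In the loop from the question, one way to use this would be to append `dfspectra['Absorbance'].to_numpy()` to a list for each file and then call `corrected_peak_maxima(spectra, 162, 218)`, plotting the returned array in place of `peak1`. Because the penalty only involves second differences, a constant offset or a linear drift in a frame is absorbed by the baseline, which is what should bring the maxima closer in line with each other.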
