Finding only the "prominent" local maxima of a 1d array

Question

I have a couple of data sets with clusters of peaks that look like the following: You can see that the main features here are clusters of peaks, each cluster having three peaks. I would like to find the x values of those local peaks, but I am running into a few problems. My current code is as follows:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import argrelmax

def rounddown(x):
    # Round down to the nearest multiple of 10 (used for the y-axis ticks).
    return int(np.floor(x / 10.0)) * 10

# np.loadtxt replaces the deprecated scipy.loadtxt alias.
pixel, value = np.loadtxt('voltage152_4.txt', unpack=True, skiprows=0)

ax = plt.axes()
ax.plot(pixel, value, '-')
ax.axis([0, np.max(pixel), np.min(value), np.max(value) + 1])

maxTemp = argrelmax(value, order=5)
maxes = []
for maxi in maxTemp[0]:
    if value[maxi] > 40:
        maxes.append(maxi)

# Plot the markers at the peaks' x values rather than at their array indices.
ax.plot(pixel[maxes], value[maxes], 'ro')

plt.yticks(np.arange(rounddown(value.min()), value.max(), 10))
plt.savefig("spectrum1.pdf")
plt.show()

This works relatively well, but still isn't perfect. Some peaks labeled: The main problem here is that my signal isn't smooth, so a few things that aren't actually my relevant peaks are getting picked up. You can see this in the stray maxima about halfway down a cluster, as well as in peaks that show two maxima where in reality there should be one. You can also see some high-frequency maxima near the center of the plot. I was picking those up, so I added the loop that only keeps values above a certain threshold.

I am afraid that smoothing the curve will actually make me lose some of the clustered peaks that I want, as in some of my other datasets they are even closer together. Maybe my fears are unfounded, though, and I am just misunderstanding how smoothing works. Any help would be appreciated.

Does anyone have a solution for picking out only the "prominent" peaks? That is, only those peaks that are quite large compared to the others?

`scipy` has a built-in peak detector: [`scipy.signal.find_peaks_cwt`](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.signal.find_peaks_cwt.html) which uses wavelet analysis. — Chris Mueller, Nov 10 '16 at 22:27
Peak detection in noisy data is not trivial, and from your results I'd say your algorithm already does a good job. A simple solution might be to implement a rule that two peaks must be separated by a deep enough valley. — MB-F, Nov 11 '16 at 06:50
Thank you both for your suggestions. I ended up going with Chris's solution, as it was a bit "safer" in that it rarely would miss peaks. I hadn't realized that peak detection was non-trivial, so I'm satisfied with my solution. — gabe, Nov 11 '16 at 17:05
An alternative approach that worked better for me than `find_peaks_cwt` was to filter the signal manually, first convolving it with a Gaussian window and then searching for the maxima (see [this answer](https://stackoverflow.com/a/25666951/12131616)). — Puco4, Jul 20 '21 at 16:22
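The smoothing approach from the last comment, and the worry in the question that smoothing might merge closely spaced peaks, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the synthetic three-peak cluster stands in for the real data, `scipy.ndimage.gaussian_filter1d` is used as the Gaussian filter (the linked answer may do it differently), and `smoothed_maxima`, `min_height` and the two `sigma` values are hypothetical names and numbers.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import argrelmax

# Synthetic stand-in for one cluster: three peaks 20 samples apart plus noise.
rng = np.random.default_rng(1)
x = np.arange(500)
centers = [200, 220, 240]
clean = sum(50.0 * np.exp(-0.5 * ((x - c) / 4.0) ** 2) for c in centers)
noisy = clean + rng.normal(scale=1.0, size=x.size)

def smoothed_maxima(y, sigma, min_height=10.0, order=5):
    """Smooth y with a Gaussian of width sigma, then return the indices of
    local maxima of the smoothed curve that exceed min_height."""
    smooth = gaussian_filter1d(y, sigma=sigma)
    idx = argrelmax(smooth, order=order)[0]
    return idx[smooth[idx] > min_height]

# A narrow kernel removes the noise wiggles but keeps the peaks separate.
light_peaks = smoothed_maxima(noisy, sigma=2)

# A kernel comparable to the peak spacing blurs the cluster together.
heavy_peaks = smoothed_maxima(noisy, sigma=12)

print("sigma=2: ", light_peaks)
print("sigma=12:", heavy_peaks)
```

In this sketch the narrow kernel recovers the three peaks, while the wide kernel finds fewer. So the fear is justified only when the smoothing width approaches the spacing between the peaks you want to keep.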

score 2 · Answer 1 · answered Sep 19 '19 at 21:48

Starting with SciPy version 1.1.0, you can also use the function scipy.signal.find_peaks, which lets you select detected peaks based on their topographic prominence. This function is often easier to use than find_peaks_cwt. You'll have to experiment a little to find the best lower bound to pass as prominence, but e.g. find_peaks(..., prominence=5) will ignore the unwanted peaks in your example. This should bring you reasonably close to your goal. If that's not enough, you can do your own peak selection based on peak properties such as left_bases and right_bases, which are optionally returned.
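A minimal sketch of the prominence filter, assuming synthetic data because the file from the question is not available. The cluster positions, the noise level and the bound `prominence=20` are all made up for this example, and the right bound for real data has to be found by experiment.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for the question's data: three clusters of three
# Gaussian peaks each, plus a little noise.
rng = np.random.default_rng(0)
x = np.arange(1000)
centers = [180, 200, 220, 480, 500, 520, 780, 800, 820]
signal = sum(50.0 * np.exp(-0.5 * ((x - c) / 4.0) ** 2) for c in centers)
signal = signal + rng.normal(scale=1.0, size=x.size)

# Without any condition, every noise wiggle counts as a peak.
all_peaks, _ = find_peaks(signal)

# Keep only peaks that rise at least 20 units above their surroundings.
prominent_peaks, props = find_peaks(signal, prominence=20)

print(len(all_peaks), "local maxima in total")
print("prominent peaks at x =", x[prominent_peaks])
# When prominence is given, the bases of each peak are returned as well.
print("left bases: ", props["left_bases"])
print("right bases:", props["right_bases"])
```

Because prominence is measured relative to the neighbouring valleys, the small double maxima that noise creates on top of a real peak are rejected, and only the tallest sample of each peak survives.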

score 0 · Answer 2 · answered Sep 20 '19 at 10:01

I'd also recommend scipy.signal.find_peaks for what you're looking for. The other, older SciPy alternative, find_peaks_cwt, is quite complicated to use.

It will basically do what you're looking for in a single line. Apart from the prominence parameter that lagru mentioned, for your data either the threshold or height parameters might also do what you need.

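Passing height=40 would keep only the peaks that reach at least 40, essentially the same cutoff your loop applies by hand.

Prominence can be a bit hard to wrap your head around at first: roughly, it measures how much a peak stands out from the surrounding baseline of the signal.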
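The difference between the height, prominence and threshold parameters is easiest to see on a tiny hand-made array. The array and the parameter values below are invented for illustration and are not taken from the question's data.

```python
import numpy as np
from scipy.signal import find_peaks

# A tall peak (index 2) with a small shoulder (index 4), a tiny bump
# (index 7), and a second tall peak (index 10).
y = np.array([0, 10, 50, 45, 47, 20, 0, 5, 0, 30, 60, 30, 0], dtype=float)

# height: absolute value of the peak sample.
by_height, _ = find_peaks(y, height=40)

# prominence: how far the peak rises above the valley separating it
# from higher terrain.
by_prominence, _ = find_peaks(y, prominence=10)

# threshold: vertical distance to BOTH directly neighbouring samples.
by_threshold, _ = find_peaks(y, threshold=10)

print(by_height.tolist())      # [2, 4, 10]
print(by_prominence.tolist())  # [2, 10]
print(by_threshold.tolist())   # [10]
```

The shoulder at index 4 passes the height test because it sits high on the flank of a real peak, but it only rises 2 units above the dip next to it, so prominence rejects it. That is the "two maxima where there should be one" case from the question.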
