I think this is hard to answer in general, because a solution that works on one dataset may fail on another. Your data is upside down, so the first step is to flip y to -y, so that the minima are treated as maxima. If you would rather avoid negative numbers, shift the flipped signal by a constant (for example y.max() - y); taking the absolute value of -y would simply undo the flip when y is positive.
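The flip described above can be sketched roughly like this. The helper name find_minima and the toy array are made up for illustration, and subtracting from the maximum is just one way to keep the flipped values non-negative:

```python
import numpy as np
from scipy.signal import find_peaks


def find_minima(y, **kwargs):
    """Return indices of local minima by running find_peaks on the flipped signal.

    Hypothetical helper; any keyword arguments are passed straight to find_peaks.
    """
    y = np.asarray(y, dtype=float)
    # Flipping turns valleys into peaks; subtracting from the max keeps values >= 0.
    flipped = y.max() - y
    minima, _ = find_peaks(flipped, **kwargs)
    return minima


# Toy signal (illustrative only) with valleys at indices 2 and 6
toy = np.array([5.0, 3.0, 1.0, 3.0, 5.0, 3.0, 0.5, 3.0, 5.0])
print(find_minima(toy))
```

Any of the find_peaks arguments discussed below (height, distance, prominence) can be passed through as keyword arguments, but remember that they then apply to the flipped signal.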
The first option is to use scipy.signal.find_peaks. Knowing your data, you can make use of some of its parameters: in my experience height, distance and prominence are the most useful.
There is a nice explanation about the parameters of find_peaks.
This will correctly identify the peaks in most cases, but it takes some time to set the arguments appropriately.
A similar solution uses scipy.signal.find_peaks_cwt. Here (in most cases) you will need to adjust the widths parameter, which is (from the docs):
1-D array of widths to use for calculating the CWT matrix. In general,
this range should cover the expected width of peaks of interest.
But again, this requires some prior knowledge about your data.
Because you have periodic data, maybe you can use an FFT to find the characteristic frequencies and use them to adjust the parameters of find_peaks and find_peaks_cwt. Since you haven't provided the dataset, I only have synthetic data to work with. Note that I return len(peaks) - 1 because there is usually an extra, partial period at the boundaries that would otherwise be counted.
import numpy as np
from scipy.signal import find_peaks, find_peaks_cwt
import matplotlib.pyplot as plt

# some generic data: a noisy sine wave
x = np.linspace(0, 1000, 10000)
y = 250 + 100 * np.sin(0.08 * x) - np.random.normal(30, 20, 10000)


def count_waves_1(x, y):
    peaks, props = find_peaks(y, prominence=120, height=np.max(y) / 10, distance=200)
    # here you can make use of props to filter the peaks by different properties,
    # for example extract only the n largest prominence peaks:
    #
    # ind = np.argpartition(props["prominences"], -n_largest)[-n_largest:]
    # peaks = peaks[ind]
    plt.figure()
    plt.plot(x, y)
    plt.plot(x[peaks], y[peaks], 'ro')
    return len(peaks) - 1


first_solution = count_waves_1(x, y)


def count_waves_2(x, y):
    # widths should cover the expected width (in samples) of the peaks
    peaks = find_peaks_cwt(y, widths=np.arange(100, 200))
    plt.figure()
    plt.plot(x, y)
    plt.plot(x[peaks], y[peaks], 'ro')
    return len(peaks) - 1


second_solution = count_waves_2(x, y)
print(first_solution, second_solution)
plt.show()
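As for the FFT idea mentioned earlier, here is a minimal sketch of how it might look on the same kind of synthetic data. The function name estimate_period_samples is made up, and using half the estimated period as the distance argument is my own assumption rather than a rule, so treat it as a starting point:

```python
import numpy as np
from scipy.signal import find_peaks


def estimate_period_samples(y):
    """Estimate the dominant period (in samples) from the largest FFT bin.

    Hypothetical helper; assumes evenly spaced samples and one dominant frequency.
    """
    y = np.asarray(y, dtype=float)
    spectrum = np.abs(np.fft.rfft(y - y.mean()))  # remove the constant offset first
    freqs = np.fft.rfftfreq(len(y), d=1.0)        # frequencies in cycles per sample
    k = np.argmax(spectrum[1:]) + 1               # skip the zero-frequency bin
    return 1.0 / freqs[k]


# same kind of synthetic data as above, seeded so the result is reproducible
rng = np.random.default_rng(0)
x = np.linspace(0, 1000, 10000)
y = 250 + 100 * np.sin(0.08 * x) - rng.normal(30, 20, 10000)

period = estimate_period_samples(y)
# Assumption: require peaks to be at least half a period apart
peaks, _ = find_peaks(y, distance=int(period / 2), prominence=120)
print(period, len(peaks))
```

The estimate is only as fine as the FFT bin spacing, so it will be off by a few percent when the record does not contain a whole number of periods. That is usually good enough for choosing distance or the widths range, but not for counting periods directly.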