I have sampling data obtained from an accelerometer, with the acceleration on each axis ('x', 'y' and 'z').

That data is stored as a Pandas DataFrame, with a column for each axis.

With that, I obtain the FFT like so:

import numpy as np
import pandas as pd
from scipy import fft
from typing import Optional

def fft_raw(df: pd.DataFrame) -> pd.DataFrame:
    """Calculates the raw Fast Fourier Transform of a DataFrame

    Parameters
    ----------
    df : pd.DataFrame
        DataFrame whose FFT should be calculated

    Returns
    -------
    pd.DataFrame
        Raw FFT
    """

    fft_raw = pd.DataFrame()
    for c in df:
        fft_raw = fft_raw.join(
            pd.DataFrame(fft.rfft(np.array(df[c])), columns=[c]), how="outer"
        )
    return fft_raw


def norm_fft(fft_raw: pd.DataFrame, length: Optional[int] = None) -> pd.DataFrame:
    """Normalizes a raw FFT

    Parameters
    ----------
    fft_raw : pd.DataFrame
        Raw FFT to be normalized
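    length : int, optional
        How many sample points were used for calculating the raw FFT
        By default it is inferred from the length of `fft_raw`, assuming an
        even number of samples was passed to `rfft`

    Returns
    -------
    pd.DataFrame
        Normalized FFT
    """

    if length is None:
        # rfft returns n // 2 + 1 bins for n samples, so invert that (even n assumed)
        length = 2 * (len(fft_raw) - 1)

    return 2.0 * (fft_raw.abs() / length)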

It'll end up looking something like this:

[Figure: Example FFT]
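Reading peaks in Hz requires a frequency index, which the two functions above do not add on their own. A minimal sketch of attaching one with `scipy.fft.rfftfreq`, assuming the sample rate is known; the helper name `with_frequency_index`, the 400 Hz rate and the synthetic 50 Hz signal are illustrative assumptions, not part of the original data:

```python
import numpy as np
import pandas as pd
from scipy import fft


def with_frequency_index(
    spectrum: pd.DataFrame, n_samples: int, sample_rate_hz: float
) -> pd.DataFrame:
    """Return a copy of `spectrum` indexed by frequency in Hz (hypothetical helper)."""
    # rfftfreq yields one frequency per rfft bin: n_samples // 2 + 1 values
    freqs = fft.rfftfreq(n_samples, d=1.0 / sample_rate_hz)
    out = spectrum.copy()
    out.index = pd.Index(freqs, name="frequency_hz")
    return out


# Synthetic signal (assumption): a 50 Hz sine of amplitude 0.9, sampled at 400 Hz for 2 s
sample_rate_hz = 400.0
t = np.arange(800) / sample_rate_hz
df = pd.DataFrame({"z": 0.9 * np.sin(2 * np.pi * 50.0 * t)})

# Raw FFT per column, then the same 2 * |X| / N normalization used above
raw = pd.DataFrame({c: fft.rfft(df[c].to_numpy()) for c in df})
spectrum = with_frequency_index(2.0 * raw.abs() / len(df), len(df), sample_rate_hz)

peak_hz = spectrum["z"].idxmax()  # frequency label of the largest bin
```

With the index in Hz, any peak search can return frequencies directly instead of bin numbers.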

I then want to extract the peaks from that FFT.

In the example given above, the peaks for the 'z' axis are, approx.:

  2.90 Hz at 0.15 g
 54.56 Hz at 0.90 g
106.22 Hz at 0.10 g

One thing I considered doing was filtering out all frequencies whose magnitude is below a given threshold (for the example, it could be around 0.1 g), and that'd give me a waveform with just the peaks in them.

The problem with that is that I'll still have a lot of sampling points around the peaks, as it usually "rises" up to the maximum for the peak, then "declines" back to near-nothing, and that takes up multiple points, not just one.
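The threshold idea and its drawback can be seen on a toy spectrum. This is only a sketch: the values below are made up for illustration and assume a Series indexed by frequency in Hz.

```python
import pandas as pd

# Hypothetical normalized spectrum for one axis, indexed by frequency in Hz
spectrum = pd.Series(
    [0.01, 0.08, 0.15, 0.07, 0.02, 0.40, 0.90, 0.35, 0.03],
    index=[2.0, 2.5, 3.0, 3.5, 4.0, 54.0, 54.5, 55.0, 55.5],
    name="z",
)

threshold = 0.1  # in g

# Boolean indexing keeps every point at or above the threshold
above = spectrum[spectrum >= threshold]
```

Here `above` keeps one point for the small peak near 3 Hz but three points (54.0, 54.5 and 55.0 Hz) for the large one, which is exactly the "rise and decline" problem described above.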

I then thought about trying to "split" the waveform into groups of points that represent a single peak, so I could then find their max, but I'm not quite sure of an efficient way of doing that.
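One possible way to do that split with plain pandas is to label each contiguous run of above-threshold points and take the maximum of each run. A minimal sketch, assuming a Series with a unique frequency index; `peaks_by_group` is a hypothetical name and the data is made up:

```python
import pandas as pd


def peaks_by_group(spectrum: pd.Series, threshold: float) -> pd.Series:
    """Return the maximum of each contiguous run of points at or above `threshold`."""
    mask = spectrum >= threshold
    # A new run starts wherever the mask flips, so the cumulative
    # count of flips gives every run its own integer label
    run_id = (mask != mask.shift(fill_value=False)).cumsum()
    # Keep only the above-threshold runs and find the label of the largest value in each
    peak_labels = spectrum[mask].groupby(run_id[mask]).idxmax()
    return spectrum.loc[peak_labels.to_numpy()]


# Hypothetical normalized spectrum, indexed by frequency in Hz
spectrum = pd.Series(
    [0.01, 0.08, 0.15, 0.07, 0.02, 0.40, 0.90, 0.35, 0.03],
    index=[2.0, 2.5, 3.0, 3.5, 4.0, 54.0, 54.5, 55.0, 55.5],
    name="z",
)

peaks = peaks_by_group(spectrum, threshold=0.1)
```

This stays vectorized, but it has limits: noise that dips below the threshold inside a peak splits it into two runs, and two nearby peaks that never drop below the threshold are merged into one.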

I was trying to find a similar question, and then came across this one, but I couldn't get it to work, even after simplifying my data back to a Numpy Array.

So I decided to ask here if someone knows of an efficient way, preferably using Pandas, to get the peaks of the FFT.

Micael Jarniac

1 Answer

I found a way of achieving a desirable result, based on this question, and using the PeakUtils library:

from typing import Dict

import pandas as pd
import peakutils


def find_peaks(
    df: pd.DataFrame, threshold: pd.Series, min_dist: int = 50
) -> Dict[str, pd.Series]:
    index = df.index
    all_peaks = dict()
    for c in df:
        # Only search the columns that have a threshold defined
        if c in threshold:
            # Work on a positional array so the caller's DataFrame is left untouched
            data = df[c].to_numpy()
            peaks = peakutils.indexes(
                data, thres=threshold[c], min_dist=min_dist, thres_abs=True
            )
            # Map the peak positions back to the original index labels
            all_peaks[c] = pd.Series(data[peaks], index=index[peaks])

    return all_peaks


def find_fft_peaks(
    df: pd.DataFrame, threshold: pd.Series, dist_hz: float = 1
) -> Dict[str, pd.Series]:
    index = df.index
    index_interval = index[-1] - index[0]
    points_per_hz = len(index) / index_interval
    min_dist = int(points_per_hz * dist_hz)
    return find_peaks(df=df, threshold=threshold, min_dist=min_dist)

It seems to do the trick quite well, although I still find it a bit hacky.

It might not be super efficient or clean, but it'll do the job for now.
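For comparison, a similar result can probably be had without PeakUtils by using `scipy.signal.find_peaks`, which accepts a minimum height and a minimum distance in samples. This is a different technique from the one above, sketched under the assumption of evenly spaced frequency bins; `fft_peaks_scipy` is a hypothetical helper and the spectrum is synthetic:

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks


def fft_peaks_scipy(
    spectrum: pd.Series, min_height: float, dist_hz: float = 1.0
) -> pd.Series:
    """Peaks of one spectrum column, indexed by frequency (hypothetical helper)."""
    freqs = spectrum.index.to_numpy()
    # Assumes evenly spaced frequency bins, as produced by an FFT
    bin_width_hz = (freqs[-1] - freqs[0]) / (len(freqs) - 1)
    # find_peaks expects the distance in samples, and it must be at least 1
    distance = max(1, int(round(dist_hz / bin_width_hz)))
    positions, _ = find_peaks(
        spectrum.to_numpy(), height=min_height, distance=distance
    )
    return spectrum.iloc[positions]


# Synthetic spectrum (assumption): three narrow bumps on a 0.5 Hz grid
freqs = np.arange(0, 200.5, 0.5)
values = np.zeros_like(freqs)
for centre_hz, height in [(3.0, 0.15), (54.5, 0.90), (106.0, 0.12)]:
    values += height * np.exp(-0.5 * ((freqs - centre_hz) / 0.5) ** 2)
spectrum = pd.Series(values, index=freqs, name="z")

peaks = fft_peaks_scipy(spectrum, min_height=0.1, dist_hz=1.0)
```

Applying it to each column, for example with a dict comprehension over `df`, would give the same `Dict[str, pd.Series]` shape that `find_peaks` returns above.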

Micael Jarniac