My goal is to detect whether a certain frequency is present in an audio recording and output a binary response. To do this, I plan to perform a Fourier transform on the audio file and query the values in the frequency bins. If the bin associated with the frequency I am looking for has a high magnitude, that frequency should be present (if my thinking is correct). However, I am having trouble generating my transform correctly. My code is below:
from scipy.io import wavfile
from scipy.fft import fft, fftfreq
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
user_in = input("Please enter the relative path to your wav file --> ")
sampling_rate, data = wavfile.read(user_in)
print("sampling rate:", sampling_rate)
duration = len(data) / float(sampling_rate)
print("duration:", duration)
number_samples_in_seg = int(sampling_rate * duration)
fft_of_data = fft(data)
fft_bins_from_data = fftfreq(number_samples_in_seg, 1 / sampling_rate)
print(fft_bins_from_data.size)
plt.plot(fft_bins_from_data, fft_of_data, label="Real part")
plt.show()
Trying this code using a few different wav files leads me to wonder whether I am displaying my transform in the time domain, rather than the frequency domain, which I need:
Input: 200hz.wav
Output:
sampling rate: 48000
duration: 60.000375
2880018
Input: 8000hz.wav
Output:
sampling rate: 48000
duration: 60.000375
2880018
Since each of these files should contain a pure tone, I would expect to see only one spike on each plot, at x = 200 or x = 8000 respectively. One final file contributes to my concern that I am not viewing the frequency domain:
Input: beep.wav
Output:
sampling rate: 48000
duration: 5.061958333333333
24297
This appears to show the distinct beeping as it progresses over an x-axis of time.
I attempted to clean up the plotting by only plotting the magnitude of the positive-frequency half of the transform. Unfortunately, I am still not seeing the frequencies isolated on a frequency spectrum:
plt.plot(fft_bins_from_data[0:number_samples_in_seg//2], abs(fft_of_data[0:number_samples_in_seg//2]))
plt.show()
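One possible cause worth ruling out: `wavfile.read` returns a 2-D array of shape (samples, channels) for multi-channel files, and `fft` transforms along the last axis by default, so on stereo data it would transform each two-sample row instead of the whole recording. The channel count of the files is not shown above, so the sketch below only assumes they are stereo. It uses a synthetic 200 Hz tone in place of a wav file, and `magnitude_spectrum` is a hypothetical helper name. It uses `numpy.fft`; `scipy.fft` provides `rfft` and `rfftfreq` under the same names.

```python
import numpy as np

def magnitude_spectrum(data, sampling_rate):
    """Hypothetical helper: return (freqs, magnitudes) for the first channel."""
    samples = np.asarray(data, dtype=float)
    # Multi-channel audio arrives as (n_samples, n_channels); keep one channel
    # so the transform runs along the time axis.
    if samples.ndim == 2:
        samples = samples[:, 0]
    # rfft returns only the non-negative frequencies of a real-valued signal.
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / sampling_rate)
    return freqs, np.abs(spectrum)

# Synthetic stand-in for a stereo recording: 1 s of a 200 Hz tone at 48 kHz.
sampling_rate = 48000
t = np.arange(sampling_rate) / sampling_rate
tone = np.sin(2 * np.pi * 200 * t)
stereo = np.column_stack([tone, tone])  # shape (48000, 2)

freqs, mags = magnitude_spectrum(stereo, sampling_rate)
peak_hz = freqs[np.argmax(mags)]
print("strongest bin is at", peak_hz, "Hz")
```

Plotting `freqs` against `mags` from this helper should give a single spike for a pure tone, provided the multi-channel assumption is what was wrong in the first place.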
I have referred to these resources before posting:
How to get a list of frequencies in a wav file
Fourier Transforms With scipy.fft: Python Signal Processing
Calculate the magnitude and phase of a signal at a particular frequency in python
What is the difference between numpy.fft.fft and numpy.fft.fftfreq
A summary of my questions:
- Are my plots displaying the time domain or frequency domain of the signal?
- Why is the number of samples equal to the number of bins, and should this be the case for the frequency domain?
- If these plots are indeed the frequency domain, how do I interpret them and query the values in the bins?
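For context on the last two questions: a discrete Fourier transform of N samples produces N complex bins spaced `sampling_rate / N` Hz apart, so equal counts are expected. For real-valued input the upper half mirrors the lower half, which is why `rfft` returns only `N // 2 + 1` bins. The sketch below shows one way a bin could be queried to get a yes/no answer. The helper name `tone_present`, the median-based noise floor, and the `ratio` threshold of 10 are all assumptions made for illustration, not an established detection method, and the input is a synthetic mono signal instead of one of the files above.

```python
import numpy as np

def tone_present(samples, sampling_rate, target_hz, ratio=10.0):
    """Hypothetical helper: True if the bin nearest target_hz stands out."""
    samples = np.asarray(samples, dtype=float)
    mags = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / sampling_rate)
    # Index of the bin whose centre frequency is closest to the target.
    idx = int(np.argmin(np.abs(freqs - target_hz)))
    # Assumed noise-floor estimate: the median magnitude over all bins.
    floor = np.median(mags) + 1e-12
    return bool(mags[idx] / floor > ratio)

# Synthetic mono test signal: a 200 Hz tone plus a little seeded noise.
sampling_rate = 48000
t = np.arange(sampling_rate) / sampling_rate
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * 200 * t) + 0.01 * rng.standard_normal(t.size)

print("200 Hz present: ", tone_present(noisy, sampling_rate, 200))
print("8000 Hz present:", tone_present(noisy, sampling_rate, 8000))
```

When the target frequency does not fall exactly on a bin centre, its energy spreads into neighbouring bins, so a real recording may need a window function or a check over a small range of bins around `idx`.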