
I have a function that detects the three most dominant frequencies in an incoming microphone stream. When I play an "E4" note (about 330 Hz) on my piano, it reports the fundamental frequency as B5 (996 Hz). There are occasionally other issues, such as reporting a C4 as a C5, but this one is glaring. When I plot the spectrum, the E is clearly the most dominant peak, but for some reason the function still reports B5.

import struct

import numpy as np
from scipy.signal import argrelextrema


def pitch_calculations(stream, CHUNK, RATE):
    # Read one chunk from the mic stream and unpack the raw bytes into signed 16-bit integers
    data = stream.read(CHUNK, exception_on_overflow=False)
    dataInt = np.array(struct.unpack(str(CHUNK) + 'h', data))
    
    # Apply a window function (Hamming) to the input data
    windowed_data = np.hamming(CHUNK) * dataInt

    # Using numpy fast Fourier transform to convert mic data into frequencies
    fft_result = np.abs(np.fft.fft(windowed_data)) * 2 / (11000 * CHUNK)
    freqs = np.fft.fftfreq(len(windowed_data), d=1.0 / RATE)
    
    # Find the indices of local maxima in the frequency spectrum
    localmax_indecies = argrelextrema(fft_result, np.greater)[0]
    
    # Get the magnitudes of the local maxima
    strong_freqs = fft_result[localmax_indecies]
    
    # Sort the magnitudes in descending order
    sorted_indices = np.argsort(strong_freqs)[::-1]
    
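    # Take the six highest peaks: the full FFT of a real signal mirrors each
    # peak at a negative frequency, so every peak is expected to appear twice
    top_indices = sorted_indices[:6]
    
    # Get the corresponding frequencies, using every other entry to skip the mirrored duplicates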
    note_1_freq = abs(freqs[localmax_indecies[top_indices[0]]])
    note_2_freq = abs(freqs[localmax_indecies[top_indices[2]]])
    note_3_freq = abs(freqs[localmax_indecies[top_indices[4]]])
    
    return note_1_freq, note_2_freq, note_3_freq

Here is an image of my graph:

[Image: Fourier graph of playing an "E4" note]
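For reference, the note names and frequencies in the question can be checked with a little arithmetic. The sketch below assumes 12-tone equal temperament with A4 = 440 Hz (the question does not state the tuning); `nearest_note` is a hypothetical helper written for illustration, not part of the original code. It maps a frequency to its nearest note and lists the first few integer multiples of E4, which may help when judging whether a reported pitch lies close to an overtone of the played note.

```python
import math

A4_HZ = 440.0  # assumed tuning reference; not stated in the question
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz):
    """Return (name, exact_freq_hz) of the equal-tempered note closest to freq_hz."""
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12)
    midi = round(69 + 12 * math.log2(freq_hz / A4_HZ))
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, A4_HZ * 2 ** ((midi - 69) / 12)


# MIDI note 64 is E4
e4 = A4_HZ * 2 ** ((64 - 69) / 12)

# Integer multiples of the fundamental and the notes they fall closest to
for harmonic in (1, 2, 3):
    f = harmonic * e4
    name, exact = nearest_note(f)
    print(f"{harmonic} x E4 = {f:7.2f} Hz, nearest note {name} ({exact:.2f} Hz)")

# The frequency reported by the function in the question
print("996 Hz is nearest to", nearest_note(996.0)[0])
```

This only checks the arithmetic; it does not by itself establish why the function reports the frequency it does.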

Reinderien
celewis
  • Please include sample data, or at the very least an image of your spectrum. Otherwise it will be difficult to meaningfully help you. – Reinderien Jun 08 '23 at 21:03
  • What do you get when you print `strong_freqs` before and after sorting? – jared Jun 08 '23 at 22:00
  • Also, you seem to be indexing `top_indices[4]` when `top_indices` should be of length 3. Is that not giving an error? – jared Jun 08 '23 at 22:06
  • Oops, that was from a previous iteration of the function. The "top_indices = sorted_indices[:3]" line should be replaced with "top_indices = sorted_indices[:6]". The fft function was duplicating each frequency, so I took the 6 highest peaks and then indexed 0, 2, and 4 instead of 1, 2, and 3. As for the strong_freqs before and after sorting, it looks like this: strong freqs: [0.00061629 0.00064024 0.00083992 ... 0.00083992 0.00064024 0.00061629] strong freqs sorted: [2527 16 15 ... 1165 1354 1189] (argsort returns the indices that would sort the list rather than the sorted list) – celewis Jun 08 '23 at 23:44
  • Given that your input signal is real-valued, you know your spectrum is symmetric, and the second half is redundant. Why not start with removing that second half? I’m really uncomfortable with picking only the even elements after sorting. – Cris Luengo Jun 09 '23 at 02:26
  • Yeah I realize that part is very janky and bad. I've fixed that now but the issue of the function picking the local maxima in the wrong order is still present. – celewis Jun 09 '23 at 02:46
  • You said that the code was choosing 996 Hz, yet the sorted `strong_freqs` is showing 2527 Hz. Which one is it? – jared Jun 09 '23 at 02:52
  • Also, I've never used `argrelextrema`, but I've had success with [`scipy.signal.find_peaks`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html). – jared Jun 09 '23 at 02:56
  • Negative frequencies in the graph? If your measured frequencies are not exact multiples of `sample_rate/DFT_size`, then aliasing occurs (usually converting your single frequency into 2 nearby ones). The sample rate is usually not very tweakable as it depends on HW settings, but you can change the DFT size. However, once non-power-of-2 sizes are needed you cannot use DFFT anymore... See [plotting real time Data on (qwt )Oscillocope](https://stackoverflow.com/a/21658139/2521214); you can use my win32 generator/oscilloscope/spectral analyser using the soundcard, linked there, to cross-compare your code. – Spektre Jun 09 '23 at 07:20
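The suggestions in the comments above (dropping the redundant half of the spectrum, trying `scipy.signal.find_peaks`, and keeping the bin spacing `sample_rate/DFT_size` in mind) can be combined into a minimal sketch. It is not the asker's fixed code: `top_peaks` is a hypothetical helper, the 10% prominence threshold is an arbitrary illustrative choice, and a synthetic three-partial tone stands in for the microphone chunk so the example runs without audio hardware.

```python
import numpy as np
from scipy.signal import find_peaks


def top_peaks(samples, rate, n_peaks=3):
    """Return the n_peaks strongest spectral peaks as (freq_hz, magnitude) pairs."""
    samples = np.asarray(samples, dtype=float)
    windowed = samples * np.hamming(len(samples))

    # rfft returns only the non-negative frequencies of a real-valued signal,
    # so no peak appears twice and no abs() on the frequencies is needed
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)

    # Keep only peaks that stand out from their surroundings; the 10% figure
    # is an assumption made for this sketch and would need tuning on real audio
    peaks, _ = find_peaks(spectrum, prominence=0.1 * spectrum.max())

    # Order the detected peaks by magnitude, strongest first
    strongest = peaks[np.argsort(spectrum[peaks])[::-1][:n_peaks]]
    return [(float(freqs[i]), float(spectrum[i])) for i in strongest]


# Synthetic stand-in for one microphone chunk: a fundamental plus two weaker partials
RATE, CHUNK = 44100, 8192  # assumed values; the question does not give them
t = np.arange(CHUNK) / RATE
f0 = 329.63  # E4 with A4 = 440 Hz
tone = (1.00 * np.sin(2 * np.pi * f0 * t)
        + 0.50 * np.sin(2 * np.pi * 2 * f0 * t)
        + 0.25 * np.sin(2 * np.pi * 3 * f0 * t))

# Reported frequencies can only land on multiples of this spacing
print(f"bin width: {RATE / CHUNK:.2f} Hz")
for freq, mag in top_peaks(tone, RATE):
    print(f"{freq:8.2f} Hz  magnitude {mag:.1f}")
```

Because each reported frequency is a bin centre, it can differ from the true pitch by up to about half a bin width; a larger chunk narrows the bins at the cost of latency.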

0 Answers