I am using a simple approach to identify musical notes with an FFT in Python. The steps involved are:

  1. Reading the sound file (.wav)
  2. Detecting silence in the file (by computing the sum of the squared samples that fall within a window)
  3. Detecting the location of the notes using the data obtained from (2)
  4. Calculating the frequency of each detected note using the DFT
  5. Matching the calculated frequency to the standard note frequencies to identify the note being played

However, in a case where the note should come out as A4 (440 Hz), I am getting a huge variation (about 2 kHz). Is there a fundamental error in my approach?
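Step 5 is not part of the code in this post. For reference, a minimal sketch of that matching step, assuming twelve-tone equal temperament with A4 = 440 Hz (the function name `freq_to_note` is only illustrative):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def freq_to_note(freq_hz, a4_hz=440.0):
    """Return the nearest equal-tempered note name (e.g. 'A4') for a frequency in Hz.

    Assumes twelve-tone equal temperament; a4_hz is the assumed reference pitch.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: A4 is 69 and each semitone is a factor of 2**(1/12).
    midi = int(round(69 + 12 * math.log2(freq_hz / a4_hz)))
    octave = midi // 12 - 1  # MIDI note 60 is C4
    return f"{NOTE_NAMES[midi % 12]}{octave}"
```

With this mapping, `freq_to_note(440.0)` gives `"A4"`.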

UPDATE: How can I pass my audio.wav file to this frequency estimator?

The complete Python code is here:

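import wave
import numpy as np

window_size = 2000    # Size of window to be used for detecting silence
beta = 1   # Silence detection parameter
max_notes = 100    # Maximum number of notes in file, for efficiency
sampling_freq = 44100   # Sampling frequency of audio signal
threshold = 200

# Reading the sound file (assumed here to be 16-bit mono PCM, scaled to [-1, 1))
with wave.open('audio.wav', 'rb') as wav_file:
    raw = wav_file.readframes(wav_file.getnframes())
sound = np.frombuffer(raw, dtype=np.int16) / 32768.0
sound_square = np.square(sound)

frequency = []   # Detected frequency of each note, in Hz
i = 0            # Start of the current window
k = 0            # Start of the current note (just after the last silent window)

# traversing sound_square array with a fixed window_size
while i <= len(sound_square) - window_size:
    # energy of the current window (the slice stays within the array bounds)
    s = np.sum(sound_square[i:i + window_size])
    # detecting the silence waves
    if s < threshold:
        if i - k > window_size * 4:
            # applying fourier transform function to the samples of the note;
            # rfft keeps only the non-negative frequencies of a real signal
            dft = np.abs(np.fft.rfft(sound[k:i]))
            # bin with the largest magnitude, skipping the DC bin
            i_max = np.argmax(dft[1:]) + 1
            # calculating frequency
            frequency.append(i_max * sampling_freq / (i - k))
        # a silent window ends the current note, so restart after it
        k = i + 1
    i = i + window_size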
John
  • You are assuming that the *frequency* of the highest magnitude peak in your spectrum corresponds to the *pitch* of the musical note - this may be true in some cases, but it is not true in general. See numerous other similar questions here on StackOverflow for a full discussion of why pitch detection is much more complicated than you might think. – Paul R Nov 14 '18 at 12:19
  • @PaulR That is what we were instructed to do. I would appreciate it if you could tell me which questions you are talking about. Can you provide a link? – John Nov 14 '18 at 12:24
  • There are a lot of previous questions - it seems that implementing guitar tuners and other similar pitch detection apps is a popular project choice for undergraduates. Just search for "guitar tuner" or "pitch detection" along with the `[fft]` tag and you should find lots of relevant info. – Paul R Nov 14 '18 at 12:37
  • @PaulR Can you tell me how I can pass my audio.wav to see whether freq_estimator.py works? https://github.com/endolith/waveform_analysis/blob/master/waveform_analysis/freq_estimation.py – John Nov 14 '18 at 13:43
  • Please adhere to the "one question per question" rule - it seems like your update should really be a separate question. – Paul R Nov 14 '18 at 15:49
  • `if s < threshold` Do you mean `>`? – Cris Luengo Nov 16 '18 at 18:43

2 Answers

Pitch is not the same as the peak-magnitude frequency bin of an FFT. Pitch is a human psychoacoustic phenomenon. A pitched sound can have a missing or very weak fundamental (common in some voice, piano, and guitar sounds) and/or many powerful overtones in its spectrum that overwhelm the pitch frequency, yet a human will still hear it as that pitch. So any FFT peak-frequency detector (even one that includes windowing and interpolation, which your code does not) will not be a robust method of musical pitch estimation. An FFT also quantizes frequency to a bin resolution (perhaps coarser than your requirements) that depends on the FFT (or window) length.
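To make the parenthetical concrete, here is a minimal sketch of an FFT peak detector that does use a Hann window and parabolic interpolation, assuming NumPy and a mono signal. The function name `fft_peak_frequency` is made up for this illustration. Interpolation refines the estimate beyond the raw bin spacing of fs/N (about 10.8 Hz for 4096 samples at 44100 Hz), but the function still reports the strongest partial, which is not necessarily the perceived pitch.

```python
import numpy as np


def fft_peak_frequency(signal, fs):
    """Estimate the frequency (Hz) of the strongest spectral peak.

    This is a peak detector, not a pitch detector: it returns whichever
    partial has the largest magnitude.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    # A Hann window reduces spectral leakage from the segment edges.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(n)))
    # Search the interior bins only, so both neighbours of the peak exist.
    peak = int(np.argmax(spectrum[1:-1])) + 1
    # Fit a parabola through the log magnitudes of the peak and its
    # neighbours; its vertex gives a fractional bin offset.
    a, b, c = np.log(spectrum[peak - 1:peak + 2] + 1e-12)
    denom = a - 2.0 * b + c
    offset = 0.0 if denom == 0 else 0.5 * (a - c) / denom
    return (peak + offset) * fs / n
```

For a test tone whose 880 Hz overtone is ten times stronger than its 440 Hz fundamental, this returns a value near 880 Hz, an octave above the fundamental.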

An answer to this Stack Overflow question includes a list of alternative pitch-estimation methods that might produce better results.
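One widely used time-domain alternative is autocorrelation, which looks for the lag at which the waveform best matches a shifted copy of itself. The sketch below is an illustration under assumptions (NumPy, a mono signal containing one steady note, a made-up function name `autocorr_pitch`, and made-up `fmin`/`fmax` search bounds); it is not taken from the linked answer.

```python
import numpy as np


def autocorr_pitch(signal, fs, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    fmin and fmax are assumed search bounds for the fundamental.
    """
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    n = len(signal)
    # Autocorrelation via the FFT; zero-padding to 2n avoids circular wrap.
    spectrum = np.fft.rfft(signal, 2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    # Only consider lags that correspond to the allowed frequency range.
    lag_min = max(1, int(fs / fmax))
    lag_max = min(n - 1, int(fs / fmin))
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag
```

Because it uses whole-sample lags, the result is quantized (a lag of 100 samples at 44100 Hz gives 441 Hz for a 440 Hz note). It can still find the fundamental of a signal made only of the 880, 1320, and 1760 Hz harmonics, where the 440 Hz component is absent from the spectrum.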

hotpaw2

Pitch tracking is implemented in `librosa.piptrack`: https://librosa.github.io/librosa/generated/librosa.core.piptrack.html#librosa.core.piptrack

Jon Nordby