I am using a simple approach to find out the musical note using FFT in python steps involved are:
- Reading the sound file(.wave)
- Detecting silence in the file(by computing square sum of squared elements of input falling within the window)
- Detecting the location of notes using data obtained from (2)
- Calculating the frequency of each detected note by using DFT
- Matching the calculated frequency to the standard frequencies of notes to identify the note that is being played.
but in a case where the note should come out to be A4/440hz, I am getting a huge variation(2K Hz) is there any fundamental error in my approach?
UPDATE: how can I pass my audio.wav file to this frequency estimator?
the complete python code is here
window_size = 2000 # Size of window to be used for detecting silence
beta = 1 # Silence detection parameter
max_notes = 100 # Maximum number of notes in file, for efficiency
sampling_freq = 44100 # Sampling frequency of audio signal
threshold = 200
# traversing sound_square array with a fixed window_size
while(i<=len(sound_square)-window_size):
s = 0.0
j = 0
while(j<=window_size):
s = s + sound_square[i+j]
j = j + 1
# detecting the silence waves
if s < threshold:
if(i-k>window_size*4):
dft = np.array(dft) # applying fourier transform function
dft = np.fft.fft(sound[k:i])
dft = np.argsort(dft)
if(dft[0]>dft[-1] and dft[1]>dft[-1]):
i_max = dft[-1]
elif(dft[1]>dft[0] and dft[-1]>dft[0]):
i_max = dft[0]
else :
i_max = dft[1]
# claculating frequency
frequency.append((i_max*sampling_freq)/(i-k))
dft = []
k = i+1
i = i + window_size