
I'm trying to extract the frequency of a note from an mp3 file that includes a synthesized sample of an A3 note, which should be 220 Hz.

This is part of the waveform I obtain using librosa:

[Figure: zoomed-in view of the sawtooth waveform]

As you can see, the wave seems to repeat itself every 400 samples. Therefore, by dividing the sampling rate, which is 22050 Hz, by 400 I should get the frequency of the waveform. However, I get 55.125 Hz instead of 220. Am I missing something or making a mistake?
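For reference, the arithmetic above can be written out as a short sketch. The function names are illustrative only, and the 400-sample period is simply the value read off the plot. It also shows that a 220 Hz tone at this sampling rate would be expected to repeat roughly every 100 samples, not every 400.

```python
def frequency_from_period(sample_rate_hz, period_samples):
    """Frequency (Hz) of a waveform that repeats every `period_samples` samples."""
    return sample_rate_hz / period_samples


def period_from_frequency(sample_rate_hz, frequency_hz):
    """Number of samples per cycle for a tone at `frequency_hz`."""
    return sample_rate_hz / frequency_hz


sr = 22050  # sampling rate reported by librosa.load in the question

# Period read off the plot (about 400 samples) -> frequency in Hz
print(frequency_from_period(sr, 400))    # 55.125

# Period an A3 (220 Hz) would have at this sampling rate
print(period_from_frequency(sr, 220.0))  # roughly 100.2 samples per cycle
```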

EDIT: Here's the code I'm using

import librosa
from matplotlib import pyplot as plt
import numpy as np
%matplotlib notebook

y, sr = librosa.load("Simple_synth/A3-saw.mp3")

plt.figure(figsize=(18,6))
plt.plot(y[2000:3000])

note_freq = sr/400

Link to the audio file: https://www.filefactory.com/file/7aqmrvq375n9/A3-saw.mp3

Mooncake
  • Hi, please include some code so we can help! Consider reading: [MCVE](https://stackoverflow.com/help/mcve) to get advice on constructing a good question. – FChm Mar 18 '19 at 09:14
  • So is it an A1 note with a frequency of 55 Hz rather than an A3 note with a frequency of 220 Hz? – Heikki Mar 18 '19 at 09:34
  • It should be an A3. It definitely doesn't sound like an A1 – Mooncake Mar 18 '19 at 11:35
  • Can you upload the audio file? – TDG Mar 18 '19 at 18:43
  • In that waveform which is not exactly sinusoidal there are also upper harmonics which affect how it sounds. – Heikki Mar 18 '19 at 20:19
  • @Heikki I understand there are higher harmonics, but shouldn't the note frequency correspond to the fundamental frequency of the wave? I uploaded the file anyway. – Mooncake Mar 18 '19 at 21:04
  • You're clearly off by a factor of `4`. Is your mp3 file stereo? If so, the samples will be frames of interleaved left and right samples. If you treat this as a mono buffer, we can account for a factor of `2`. – marko Mar 18 '19 at 21:14
  • @marko Yes, it's a stereo file. However, I tried converting to mono and reloading the file and I get exactly the same result... – Mooncake Mar 18 '19 at 22:37
  • The fundamental frequency should be sinusoidal. In the sample there are several frequencies, especially due to the big jump in the waveform. The ear may be lured into believing that some harmonic other than the main one is the "base" note, especially with headphones, which are bad at reproducing very low frequencies. – Heikki Mar 19 '19 at 08:24

1 Answer


For the given audio sample

import librosa
from matplotlib import pyplot as plt
import numpy as np

y, sr = librosa.load("A3-saw.mp3")

it is possible to calculate the Fourier transform (see "How to extract frequency associated with FFT values in Python")

# calculate fast fourier transform
w = np.fft.fft(y)

# frequencies associated to the fourier transform
freqs = np.fft.fftfreq(len(y))
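As a side note, `np.fft.fftfreq` with its default spacing returns frequencies in cycles per sample, which is why a multiplication by `sr` is needed to get Hz. A minimal sketch of the equivalent call with the sample spacing passed as `d` follows; the values of `n` and `sr` here are illustrative and are not taken from the audio file.

```python
import numpy as np

n = 8       # number of samples (illustrative)
sr = 8000   # sampling rate in Hz (illustrative, not the file's rate)

# Default spacing d=1.0: frequencies in cycles per sample
freqs_cycles = np.fft.fftfreq(n)

# Sample spacing d=1/sr seconds: frequencies directly in Hz
freqs_hz = np.fft.fftfreq(n, d=1.0 / sr)

# Both routes give the same frequency axis in Hz
print(freqs_cycles * sr)
print(freqs_hz)
```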

Then find the highest peak in the Fourier transform and convert its frequency to Hz:

idx = np.argmax(np.abs(w))
freq = freqs[idx]
freq_in_hertz = abs(freq * sr)
print(freq_in_hertz)

54.90196078431373
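One way to sanity-check this peak-picking approach is to run it on a signal whose frequency is known. The sketch below is an assumption-laden illustration, not the code used above: it generates a synthetic 220 Hz sawtooth instead of loading the mp3, uses `np.fft.rfft` (the real-input variant of the FFT) rather than `np.fft.fft`, and the helper name `dominant_frequency` is made up for the example.

```python
import numpy as np


def dominant_frequency(y, sr):
    """Return the frequency (Hz) of the largest non-DC peak in the spectrum of y."""
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)  # bin frequencies in Hz
    spectrum[0] = 0.0  # ignore the DC component
    return freqs[np.argmax(spectrum)]


sr = 22050                 # same sampling rate as in the question
t = np.arange(sr) / sr     # one second of sample times
f0 = 220.0                 # known fundamental of the synthetic test tone

# Naive (non-band-limited) sawtooth in the range [-1, 1)
saw = 2.0 * ((t * f0) % 1.0) - 1.0

peak = dominant_frequency(saw, sr)
print(peak)  # expected to be close to 220 Hz for this synthetic signal
```

If the same function reports about 55 Hz on the loaded file, the strongest spectral component of the decoded audio really is near 55 Hz, which is consistent with the 400-sample period seen in the plot.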

There are also higher harmonics involved in the sample, which can be seen by plotting more peaks

plt.plot(sr*freqs[0:500],abs(w[0:500]))


plt.plot(sr*freqs[0:2000],abs(w[0:2000]))

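The harmonic structure can also be inspected numerically instead of by eye. The following sketch reads the spectral magnitude at integer multiples of a given fundamental. It runs on a synthetic 55 Hz sawtooth as a stand-in for the audio file, and `harmonic_magnitudes` is a hypothetical helper written for this illustration.

```python
import numpy as np


def harmonic_magnitudes(y, sr, f0, n_harmonics=5):
    """Spectral magnitude at the bins nearest to f0, 2*f0, ..., n_harmonics*f0."""
    spectrum = np.abs(np.fft.rfft(y))
    bin_width = sr / len(y)  # Hz per FFT bin
    bins = [int(round(k * f0 / bin_width)) for k in range(1, n_harmonics + 1)]
    return np.array([spectrum[b] for b in bins])


sr = 22050
t = np.arange(sr) / sr   # one second of sample times
f0 = 55.0                # illustrative fundamental, close to the peak found above

# Synthetic stand-in for the recording: naive sawtooth in [-1, 1)
saw = 2.0 * ((t * f0) % 1.0) - 1.0

mags = harmonic_magnitudes(saw, sr, f0)
relative = mags / mags[0]  # harmonic strength relative to the fundamental
print(relative)            # for an ideal sawtooth this falls off roughly as 1/k
```

For the real recording, `saw` would be replaced by the loaded `y` and `f0` by the detected peak frequency; the relative magnitudes then show how strong each harmonic is compared with the fundamental.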

Heikki