Peak frequencies from .wav file

Question

I have a .wav file that I recorded while playing guitar notes. I used the program below to read the .wav file data with the NAudio library.

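using System;
using NAudio.Wave;

// Length is in bytes; AudioFileReader delivers 32-bit float samples,
// so the sample count is Length / sizeof(float).
using (var readertest = new AudioFileReader(@"E:\song\music.wav"))
{
    int samplecount = (int)(readertest.Length / sizeof(float));
    var buffer = new float[samplecount];
    int read = readertest.Read(buffer, 0, samplecount);

    // Print only the samples that were actually read.
    for (int i = 0; i < read; i++)
    {
        Console.WriteLine(buffer[i]);
    }
}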

It produces output like the following (partial):

       0.00567627
       0.007659912
       0.005187988
       0.005706787
       0.005218506
       0.003051758
       0.004669189
       0.0007324219
       0.004180908
      -0.001586914
       0.00402832
      -0.003479004
       0.003143311
      -0.004577637
       0.001037598
      -0.005432129
      -0.001800537
      -0.005157471

I'm confused about what this output data contains. I want to find the peak frequencies where the notes are played. How can I convert the above data to frequencies?

This question is too broad. The simple answer is that you need to use a Fast Fourier Transform (FFT). Beyond that, this is a large and complex field. Suggest you start studying. See : http://stackoverflow.com/q/170394/327083 — J..., Oct 21 '15 at 15:30
See this : http://stackoverflow.com/questions/24016477/how-to-calculate-fft-using-naudio-in-realtime-asio-out AND http://stackoverflow.com/questions/18813112/naudio-fft-result-gives-intensity-on-all-frequencies-c-sharp/20414331#20414331 — PaulF, Oct 21 '15 at 15:30
Thanks for replying. Can you tell me how to convert those output values to frequencies? Is there an equation or some other way to do that? — chade, Oct 22 '15 at 14:54
How? You use an FFT... you say thanks for the replies but I don't think you actually read them. The answers to your question are there. — J..., Oct 22 '15 at 16:28

score 4 · Answer 1 · answered Oct 24 '15 at 15:12

The data you are seeing is the raw samples in floating point format. This is the waveform data that represents the audio signal. When sent to the playback device it produces the sound.

To get a frequency map you need to pass blocks of sample data through an FFT function. The FFT returns a pair of values (X and Y, the real and imaginary parts) for each frequency bin, and from these you can calculate the level of each frequency in the signal. The magnitude of a bin is Sqrt(X*X + Y*Y), and its level in decibels is 20 * Log10(Sqrt(X*X + Y*Y)), which is the same as 10 * Log10(X*X + Y*Y), for each element in the array. (And you probably never thought you'd use Pythagoras' theorem outside of trig class!)
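As a rough illustration of the step above, here is a minimal sketch that turns one block of samples into per-bin levels. The names SpectrumSketch, Dft and LevelDb are made up for this example. A direct O(N^2) DFT stands in for the FFT so the code has no dependencies; it yields the same bins, only more slowly, and in a real program an FFT routine such as the one NAudio ships would take its place. The 1/N scaling is an assumption of this sketch, and other FFT routines may scale differently.

```csharp
using System;
using System.Numerics;

public static class SpectrumSketch
{
    // Direct O(N^2) DFT, used here only as a stand-in for a real FFT routine.
    // Returns one complex value (X = Real, Y = Imaginary) per frequency bin.
    public static Complex[] Dft(float[] samples)
    {
        int n = samples.Length;
        var bins = new Complex[n];
        for (int k = 0; k < n; k++)
        {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.Cos(angle);
                im += samples[t] * Math.Sin(angle);
            }
            // Assumption: scale by 1/N so the result does not grow with block size.
            bins[k] = new Complex(re / n, im / n);
        }
        return bins;
    }

    // Level of one bin in decibels: 20*log10(magnitude) == 10*log10(X*X + Y*Y).
    public static double LevelDb(Complex bin)
    {
        double power = bin.Real * bin.Real + bin.Imaginary * bin.Imaginary;
        return 10.0 * Math.Log10(power + 1e-20); // tiny offset avoids log10(0)
    }
}
```

Calling SpectrumSketch.Dft on a block and then LevelDb on each of the first N/2 bins gives the data a spectrum display would plot.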

The resulting array has the same number of items as you passed to the FFT. Element n represents the frequency n * Fs / N, where n is the offset into the array, N is the array length and Fs is the sample rate. Take the bottom half of the array and work with that. For real-valued audio input the top half mirrors the bottom half, so it is of no use to you. The highest frequency you can measure is therefore half the sample rate, so make sure your sample rate is more than twice the highest frequency you are interested in.
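The bin-to-frequency mapping and the "bottom half only" rule can be sketched as follows. PeakSketch, BinToFrequency and PeakFrequency are hypothetical names, and the input is assumed to be an array holding one magnitude per FFT bin (all N of them). Skipping bin 0, which holds the DC offset, is a choice made for this sketch.

```csharp
using System;

public static class PeakSketch
{
    // Frequency in Hz represented by bin n of an N-point FFT: n * Fs / N.
    public static double BinToFrequency(int n, double sampleRate, int fftLength)
    {
        return n * sampleRate / fftLength;
    }

    // Returns the frequency of the strongest bin in the bottom half of the array.
    // Bin 0 (the DC offset) is skipped; the top half is ignored entirely.
    public static double PeakFrequency(double[] magnitudes, double sampleRate)
    {
        int fftLength = magnitudes.Length;
        if (fftLength < 4) throw new ArgumentException("Need at least 4 bins.");

        int best = 1;
        for (int n = 2; n < fftLength / 2; n++)
        {
            if (magnitudes[n] > magnitudes[best]) best = n;
        }
        return BinToFrequency(best, sampleRate, fftLength);
    }
}
```

For a sense of scale: with a 44100 Hz recording and a 4096-point FFT, a guitar's open A string at 110 Hz falls between bins 10 and 11, so the reported peak is only accurate to within about one bin width.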

The size of the buffer you pass to the FFT is a trade-off between frequency resolution, response time and allowance for the windowing function. Too short a buffer will get nasty spectral bleed and your frequency resolution goes out the window; too long and it can be late recognizing the tones. The length also needs to be a power of two for the FFT, so picking the right value is probably going to take some work. Test the various options and see which one fits best for you.
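To put numbers on that trade-off, the sketch below computes the bin spacing (Fs / N) and the time span (N / Fs) for a given buffer length, checks the power-of-two requirement, and applies a window to a block. All names here are hypothetical, and the Hann window is just one common choice assumed for illustration; the answer does not say which window to use.

```csharp
using System;

public static class BufferSketch
{
    // Spacing between adjacent FFT bins in Hz (smaller means finer resolution).
    public static double BinSpacingHz(double sampleRate, int fftLength)
    {
        return sampleRate / fftLength;
    }

    // Time span covered by one FFT buffer in milliseconds (larger means slower response).
    public static double BufferMilliseconds(double sampleRate, int fftLength)
    {
        return 1000.0 * fftLength / sampleRate;
    }

    // True when n is a positive power of two.
    public static bool IsPowerOfTwo(int n)
    {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Multiplies the block by a Hann window in place to reduce spectral bleed.
    // Assumption: Hann is used here only as an example window.
    public static void ApplyHannWindow(float[] block)
    {
        int n = block.Length;
        if (n < 2) return; // nothing sensible to window
        for (int i = 0; i < n; i++)
        {
            double w = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / (n - 1)));
            block[i] = (float)(block[i] * w);
        }
    }
}
```

At 44100 Hz, a 1024-sample buffer spans about 23 ms with bins roughly 43 Hz apart, while a 4096-sample buffer spans about 93 ms with bins roughly 10.8 Hz apart. Since the lowest guitar notes are only about 5 Hz apart (low E at 82.41 Hz, F at 87.31 Hz), telling them apart is likely to need the longer buffers.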

Mark Heath, the author of NAudio, has written some FFT visualization code in the NAudioWpfDemo sample application. Have a look at the SpectrumAnalyzer custom control, which contains the power function (in SpectrumAnalyzer.GetYPosLog). Also look at the SampleAggregator class, which contains the code that aggregates samples for the FFT.

Omg, thank you very much. I was looking for a good answer like this. — chade, Nov 13 '15 at 18:03
@Corey "Too short a buffer will get nasty spectral bleed and your frequency resolution goes out the window" -> what do you mean by that? Could this be the cause of frequencies being detected at around 4 kHz for a human voice recording? — Saryk, Jun 16 '17 at 14:44
@Saryk The shorter your input buffer the fewer output bins you get and the more junk you will get in the output. But you'll only get values up to half the sample rate at best, so a low sample rate might be your problem. — Corey, Jun 18 '17 at 12:19
