Computing the discrete Fourier transform of audio data with FFTW

Question

I am quite new to signal processing, so forgive me if I ramble a bit. I have downloaded and installed FFTW for Windows. The documentation is OK, but I still have some questions.

My overall aim is to capture raw audio data sampled at 44100 samples/sec from the computer's sound card (this part is already implemented using libraries and my own code), and then perform a DFT on blocks of this audio data.

I am only interested in finding a range of frequency components in the audio and I will not be performing any inverse DFT. In this case, is a real-to-real transformation all that is necessary, i.e. the fftw_plan_r2r_1d() function?

My blocks of data to be transformed are 11025 samples long. My function is called as shown below. This will result in a spectrum array of 11025 bins. How do I know the maximum frequency component in the result?

I believe that the bin spacing is Fs/n = 44100/11025 = 4 Hz. Does this mean that the spectrum array covers 0 Hz all the way up to 44100 Hz in steps of 4 Hz, or only up to the Nyquist frequency, 22050 Hz?

This would be a problem for me as I only wish to search for frequencies from 60 Hz up to 3000 Hz. Is there some way to limit the transform range? I don't see any arguments for this in the function, or maybe there is another way?

Many thanks in advance for any help with this.

p = fftw_plan_r2r_1d(11025, audioData, spectrum, FFTW_REDFT00, FFTW_ESTIMATE);

One question per question please. – Paul R Oct 04 '16 at 06:19

score 3 · Answer 1 · edited May 23 '17 at 12:00

To answer some of your individual questions from the above:

- you need a real-to-complex transform, not real-to-real
- you will calculate the magnitude of the complex output bins at the frequencies of interest (magnitude = sqrt(re*re + im*im))
- the frequency resolution is indeed Fs / N = 44100 / 11025 = 4 Hz, i.e. the width of each output bin is 4 Hz
- for a real-to-complex transform you get N/2 + 1 output bins giving you frequencies from 0 to Fs / 2
- you just ignore frequencies in which you are not interested - the FFT is very efficient so you can afford to "waste" unwanted output bins (unless you are only interested in a relatively small number of output frequencies)

Additional notes:

- plan creation does not actually perform an FFT - typically you create a plan once and then use it many times (by calling fftw_execute)
- for performance you probably want to use the single precision calls (e.g. fftwf_execute rather than fftw_execute, and similarly for plan creation etc)

Some useful related questions/answers on StackOverflow:

There are many more similar questions and answers which you might also want to read - search for the fft and fftw tags.

Also note that dsp.stackexchange.com is the preferred site for questions on DSP theory, as opposed to specific programming problems.

edited May 23 '17 at 12:00 · Community

answered Oct 04 '16 at 06:24 · Paul R

Thanks Paul R. What happens if I perform a real-to-real transform instead of real-to-complex? I always thought that if I am only interested in the amplitudes of the frequency components, I need only the real part. – Engineer999 Oct 04 '16 at 10:41
No, you don't "only need the real part" - every frequency component has a magnitude and a phase - this is expressed as a complex number (see the second bullet point above for how to get just the *magnitude* from a complex output bin). Note that FFTW supports real-to-real transforms but these are not Discrete *Fourier* Transforms but Discrete *Hartley/Cosine/Sine* Transforms. See [FFTW documentation on real transform kinds](http://www.fftw.org/doc/Real_002dto_002dReal-Transform-Kinds.html). – Paul R Oct 04 '16 at 11:27
Thanks again for your help. I know, I should have asked this on the DSP site as it relates more to theory than to specific code. Does the magnitude of the sampled time-domain data matter? Does it mean the magnitude of the output FFT bins will be higher? Also, as I have discrete audio data to transform, this will be a real-to-complex transformation. What would be an example of complex-to-complex, then? When can we have time-domain data that is complex? – Engineer999 Oct 04 '16 at 20:24
1. The magnitude of the sampled time domain data only matters in that it needs to have enough range that you don't suffer from quantisation noise, but that's generally true anyway. 2. The DFT/FFT is linear, so if you double the magnitude in one domain you double it in the other. 3. Complex-to-complex is the general case and is used in many places, e.g. communications, where you often have [quadrature signals](https://en.wikipedia.org/wiki/In-phase_and_quadrature_components). – Paul R Oct 05 '16 at 07:19
