
I'm trying to get the correct FFT bin index for a given frequency. The audio is sampled at 44.1 kHz and the FFT size is 1024. The signal is real (captured from PyAudio, decoded through numpy.fromstring, windowed by scipy.signal.hann). I then perform the FFT through scipy.fftpack.rfft and compute the magnitude of the result in decibels; in full: magnitude = 20 * scipy.log10(abs(rfft(audio_sample)))

Based on this, and this, I originally had my mapping from the FFT bin index, k, to any frequency, F, as:

F = k*Fs/N for k = 0 ... N/2-1, where Fs is the sampling rate and N is the FFT size, in this case 1024. And the reverse as:

k = F*N/Fs for F = 0Hz ... Fs/2-Fs/N

However, I then realized that the result of rfft is not symmetric like that of fft, and that it is returned in an array of size N. I now have some questions regarding the mapping and the function. Unfortunately, the documentation did not help much, as I'm a novice in this area.

My questions:

  1. To me, the result of rfft on an audio sample can be used directly from the first bin to the last bin, as no symmetry occurs in the output, is that correct?

  2. Given the lack of symmetry from the above, the frequency resolution appears to have increased, is this interpretation correct?

  3. Because I am using rfft, my mapping function from bin index k to frequency F is now F = k*Fs/(2N) for k = 0 ... N-1. Is this correct?

  4. Conversely, the reverse mapping function from frequency F to bin index k now becomes k = 2*F*N/Fs for F = 0Hz ... Fs/2-(Fs/2/N). Is this correct?

My general confusion arises from how rfft is related to fft, and how the mapping can be done correctly while using rfft. I believe my mapping is offset by a small amount, and that is crucial in my application. Please point out the mistake or advise on the matter if possible. Thank you very much.

Fan Jin
  • [scipy.fftpack.rfft](http://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.rfft.html) – Fan Jin Feb 03 '14 at 02:35

1 Answer


First to clear up a few things for you:

A quick look at the fftpack documentation reveals that rfft only gives you an output vector from 0..512 (in your case). The reason is exactly the symmetry present when calculating the discrete Fourier transform of a real-valued input: y[k] = y*[N-k] (see the Wikipedia page on DFTs). Therefore, the rfft function only calculates and stores N/2+1 values, since you can obtain the other half by just taking the complex conjugates (should you really want them, say, for plotting). The fft function makes no assumption about the input values (they can have both a real and an imaginary part), so no symmetry can be assumed in the output and it gives you a full output vector with N values. Admittedly, most applications use a real input, so people tend to assume the symmetry is always there. Note that the Fast Fourier Transform (FFT) is an (efficient) algorithm to calculate the Discrete Fourier Transform (DFT), and the rfft function also uses the FFT to do the calculation.
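
The conjugate symmetry described above can be checked numerically. The following is only a minimal sketch: `naive_dft` is a hypothetical helper written for illustration (a direct O(N^2) DFT standing in for a real FFT routine), and the sample values are arbitrary.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT of a sequence (illustrative stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Arbitrary real-valued samples, N = 8 (hypothetical data).
x = [0.0, 1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0]
n = len(x)
y = naive_dft(x)

# For real input, y[k] equals the complex conjugate of y[N - k], k = 1..N-1.
symmetric = all(abs(y[k] - y[n - k].conjugate()) < 1e-9 for k in range(1, n))

# Hence only bins 0..N/2 carry independent information: N/2 + 1 values.
unique_bins = y[: n // 2 + 1]

print(symmetric, len(unique_bins))
```

For even N, bins 0 and N/2 are purely real, which is why a real-input transform needs only N real numbers in total to describe all N/2+1 complex bins.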

In light of the above, your indices for accessing the output vector are out of bounds, i.e. > 512. Why and how you are able to do this depends on your code. You should clearly distinguish between the 'logical N' (that you use to map the bin frequencies, define the DFT etc.) and the 'computational N' (the actual number of values in your output vector); then all your problems should disappear.

To concretely answer your questions:

  1. No. There is symmetry and you need to use this to calculate the last bins (but they give you no extra information).

  2. No. The only way to increase resolution of a DFT is to increase your sample length.

  3. No, but almost. F = k*Fs/N for k = 0..N/2

  4. For an output vector with N bins you get frequencies from 0 to (N-1)/N*Fs. Using the rfft you will have an output vector with N/2+1 bins. You do the maths, but I get 0..Fs/2
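
As a rough illustration of the mapping in points 3 and 4, here is a small sketch using the sampling rate and FFT size from the question. The function names `bin_to_freq` and `freq_to_bin` are hypothetical, and rounding to the nearest bin is an assumption about what the application wants.

```python
def bin_to_freq(k, fs, n):
    """Centre frequency in Hz of logical bin k, valid for k = 0..n/2."""
    return k * fs / n


def freq_to_bin(f, fs, n):
    """Nearest logical bin index for a frequency f in 0..fs/2 (assumed rounding)."""
    if not 0 <= f <= fs / 2:
        raise ValueError("frequency must lie between 0 and fs/2")
    return int(round(f * n / fs))


FS = 44100  # sampling rate from the question, in Hz
N = 1024    # logical FFT size from the question

print(bin_to_freq(1, FS, N))       # bin spacing Fs/N
print(bin_to_freq(N // 2, FS, N))  # last independent bin, Fs/2
print(freq_to_bin(1000.0, FS, N))  # bin closest to 1 kHz
```

The bin spacing stays at Fs/N (about 43 Hz here) regardless of how the output array is stored, so a frequency that falls between two bins can be off by up to half a bin after rounding.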

Hope things are clearer now.

ObeyTheDiode
  • Have you actually tried using `scipy.fftpack.rfft` on a vector of length *N*? You will find that it indeed gives you a length *N* output. You are probably thinking of `np.fft.rfft`, which will give you an output of length `N // 2 + 1`. – ali_m Feb 03 '14 at 12:04
  • I have not used fftpack, nor Python, but the documentation I referenced is indeed for fftpack and does not affect the explanation of the theory, that I believe is causing the confusion. – ObeyTheDiode Feb 03 '14 at 12:18
  • As I hinted, the reason why N elements can be accessed is not known to me (perhaps the N input values are copied and the transform run in-place), but I would not trust the values above N/2 because it contradicts the documentation and essentially the whole point of having an rfft function. – ObeyTheDiode Feb 03 '14 at 12:32
  • Take a look at the documentation again - for an even length input, the output is `[y(0),Re(y(1)),Im(y(1)),...,Re(y(n/2))]`, i.e. it seems to contain the real and imaginary components interleaved. Further down it clarifies that "`If n is not specified (the default) then n = x.shape[axis]`". It's the interpretation of this output that confuses the OP, and frankly, me too. – ali_m Feb 03 '14 at 12:34
  • OK, I understand your point. I referred to (logical) values that are complex-valued, and not the actual elements of the array (which I assumed are stored as a complex data type). Assuming the output stores (n/2 + 1) _complex values_ as n separate _elements_ in a vector, instead of as a complex number data type, then one should just use the appropriate indices to calculate the magnitudes of the various bins. The mechanisms/syntax to access the output I cannot help with. As I said, distinguishing between the logical and computational n is the important thing. – ObeyTheDiode Feb 03 '14 at 13:15
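
To make the last comments concrete, here is a minimal sketch of how the interleaved layout quoted above, `[y(0),Re(y(1)),Im(y(1)),...,Re(y(n/2))]`, could be turned back into n/2+1 complex bins before taking magnitudes. It assumes an even-length input and uses plain Python lists; `unpack_rfft` and `to_db` are hypothetical helper names, and the packed values are made up rather than real rfft output.

```python
import math


def unpack_rfft(packed):
    """Rebuild complex bins 0..n/2 from the interleaved layout (even n only)."""
    n = len(packed)
    if n % 2:
        raise ValueError("this sketch only handles even lengths")
    bins = [complex(packed[0], 0.0)]          # y(0) is purely real
    for k in range(1, n // 2):
        # Re(y(k)) sits at index 2k-1, Im(y(k)) at index 2k.
        bins.append(complex(packed[2 * k - 1], packed[2 * k]))
    bins.append(complex(packed[n - 1], 0.0))  # y(n/2) is purely real
    return bins


def to_db(z, floor=1e-12):
    """Magnitude in dB, with a small floor to avoid log10(0) (assumed choice)."""
    return 20.0 * math.log10(max(abs(z), floor))


# Made-up packed vector of length n = 8, standing in for rfft output.
packed = [4.0, 3.0, 4.0, 0.0, -2.0, 1.0, 0.0, -5.0]
bins = unpack_rfft(packed)
magnitudes_db = [to_db(z) for z in bins]

print(len(bins), [abs(z) for z in bins])
```

With this layout, calling abs() directly on the packed array, as in the question's magnitude formula, would give the absolute values of the real and imaginary parts separately instead of the magnitude of each bin. Bin k of the unpacked result still corresponds to F = k*Fs/N. As ali_m notes, `np.fft.rfft` returns the N // 2 + 1 complex values directly, which avoids the unpacking step.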