Using PCM samples as input for DFT

Question

I am writing an application which will compute the DFT (using a FFT algorithm) of a sound signal. The inputs I have for the FFT algorithm are PCM samples - namely, I have a large list of 16-bit unsigned integers.

I am aware of the fact that I will need to compute the DFT of several segments of the sound signal independently using a window function, and I have already written working code which decodes an input sound file to raw PCM samples.

My question is about the definition of the DFT given on Wikipedia:

The DFT is supposed to perform an invertible, linear transformation on the inputs x(0), x(1), ..., x(N-1), where each x(n) is a complex number. However, I do not understand how to take my decoded sample integers, and turn them into complex numbers suitable for the algorithm.

I have seen certain examples online where each sample is divided to obtain a floating-point value in the range [0, 1), and then the imaginary part is set to 0.

Is this scaling down to [0, 1) necessary? And is representing each sample as x + 0i where x is the sample value correct?

yes imaginary part = 0 for all input values and no scaling to < 0 , 1 > is not necessary. beware that there are many DFT implementations with different scaling factors so check the magnitude or power of the output signal and rescale to your needs (to avoid overflows ...) I usually use normalized DFT which does not change the magnitude of signal — Spektre, May 24 '14 at 07:39
also look here: http://stackoverflow.com/a/21658139/2521214 in one of the comments is link to mine win32 soundcard oscilloscope/spectrum analyser and generator so you can compare your results with it... — Spektre, May 24 '14 at 07:47

pentadecagon · Accepted Answer · 2014-05-24T09:16:21.140

2

Yes, you can create complex numbers by adding an imaginary part of 0 to each real value. Try that, it will work. However, you just doubled the amount of data to process and you created a lot of redundancy. You can notice the redundancy in the output: The resulting coefficients for the positive frequencies and for the negative frequencies will be identical, except for a different sign of the imaginary part. So to improve efficiency and reduce redundancy one typically uses a different transformation to turn N real values into N/2 complex values, and as a result you get (roughly) N/2 frequencies. I won't go into details here, but a nice implementation of both the complex FFT and the transformation for real input can be found here: http://sourceforge.net/projects/kissfft/

About your final question: No. You don't need to scale your input. The DFT is a linear transformation, so scaled input simply results in identically scaled output.

EDIT: BTW, are you sure it's a complex DFT what you want? For real data, in particular for PCM data, you should consider the Cosine Transform instead, which maps directly from real input data to real output.

edited May 24 '14 at 09:16

answered May 24 '14 at 09:09

pentadecagon

4,717
2
18
26

This is a great answer. :) Exactly the information I wasn't clear on. My goal is to produce something like this: http://en.wikipedia.org/wiki/File:Spectrogram-19thC.png, and from what I've read one should use the Short-time Fourier transform (which seems to simply compute the DFT on several small segments of the input signal). However, since the DCT-II "is exactly equivalent (up to an overall scale factor of 2) to a DFT of 4N real inputs of even symmetry where the even-indexed elements are zero", perhaps I can use it instead of the DFT, and apply the same windowing technique? – CmdrMoozy May 24 '14 at 16:01
I'd use a normal DFT, resulting in complex frequency components. The norm of those complex coefficients would be the energy at the respective frequencies. The correct post-processing and interpretation of the result from a DCT is more involved. See the dotty artifacts in that example image? Could be the result from incorrect post-processing. You wouldn't want that. But of course, if you are careful the results from a DCT could be very good as well. For perfect performance you might even consider the MDCT here: http://en.wikipedia.org/wiki/MDCT – pentadecagon May 24 '14 at 16:43
Great, thanks for the help! I'll do some more reading online, and inspect the source code of KissFFT. :) – CmdrMoozy May 24 '14 at 17:36

Using PCM samples as input for DFT

1 Answers1