I am writing an application which will compute the DFT (using an FFT algorithm) of a sound signal. The inputs I have for the FFT algorithm are PCM samples: namely, I have a large list of 16-bit unsigned integers.
I am aware that I will need to compute the DFT of several segments of the sound signal independently, using a window function, and I have already written working code which decodes an input sound file to raw PCM samples.
My question is about the definition of the DFT given on Wikipedia:
The DFT is supposed to perform an invertible, linear transformation on the inputs x(0), x(1), ..., x(N-1), where each x(n) is a complex number. However, I do not understand how to take my decoded sample integers and turn them into complex numbers suitable for the algorithm.
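For reference, the forward transform is commonly written as below. The output symbol X(k) is an assumed notation that does not appear in the text above, and sign and normalization conventions vary between sources.

$$
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-2\pi i\, k n / N}, \qquad k = 0, 1, \ldots, N-1
$$

If every x(n) is real, each term is simply a real sample multiplied by a complex exponential, so the complex arithmetic comes entirely from the exponential factor.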
I have seen certain examples online where each sample is divided to obtain a floating-point value in the range [0, 1), and then the imaginary part is set to 0.
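To make the conversion in those examples concrete, here is a minimal sketch of what I understand them to be doing. The helper name `pcm_to_complex`, its `bits` and `normalize` parameters, and the choice of 2^bits as the divisor are my own assumptions for illustration; they are not taken from any particular library or example.

```python
def pcm_to_complex(samples, bits=16, normalize=True):
    """Turn unsigned PCM integers into complex numbers with zero imaginary part.

    Hypothetical helper: with normalize=True each sample is divided by
    2**bits (an assumed divisor), giving a real part in [0, 1).
    With normalize=False the raw integer value is used as the real part.
    """
    scale = float(1 << bits) if normalize else 1.0
    return [complex(s / scale, 0.0) for s in samples]


# A few example 16-bit unsigned sample values (made up for illustration).
samples = [0, 16384, 32768, 65535]

scaled = pcm_to_complex(samples)                  # real parts in [0, 1)
raw = pcm_to_complex(samples, normalize=False)    # real parts are the raw integers

print(scaled)
print(raw)
```

Either list could then be passed to an FFT routine that expects complex inputs; the two differ only by a constant factor in the real part.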
Is this scaling down to [0, 1) necessary? And is representing each sample as x + 0i, where x is the sample value, correct?