ifft(fft(audio)) is just noise

Question

whether i just nest them (`ifft(fft(audio))`) or go window-by-window (window the audio, do the fft, do the ifft, then invert the window, replacing zero with eps, then merge the samples back, trying abs here and there in the pipeline), i get only noise.

i know the ifft is only inverse to the fft with infinite precision arithmetic, infinitely many samples, etc (right?). i'm working with 64-bit floating point and a 44 kHz sample rate, but i would expect to be able to at least hear the original audio.

is my error practical or theoretical? i can give code, if it's a bug.

Is it a case of how you're encoding your data back to audio? `ifft` will return complex floats. Depending on the library you're using, it might just dump the memory buffer of the array out to disk as a .wav without re-casting things back to floats. That's one way to produce complete junk as output, at any rate... Try writing `ifft(fft(audio)).real` or `abs(ifft(fft(audio)))` and see if it changes anything — Joe Kington, Jan 29 '13 at 02:06
i used `scipy.io.wavfile.write()`. i had already tried both, the problem was getting the right `dtype` (see below). — sam boosalis, Jan 29 '13 at 05:37
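
The point raised in the comments can be checked directly. The following is a minimal sketch that assumes a synthetic float64 signal in place of the real audio (the names `audio` and `roundtrip` are illustrative only). It shows that `np.fft.ifft` hands back a complex array even for real input, and it measures how far the round trip drifts from the original samples.

```python
import numpy as np

# Assumption: one second of synthetic float64 samples stands in for the audio.
rng = np.random.default_rng(0)
audio = rng.standard_normal(44100)

roundtrip = np.fft.ifft(np.fft.fft(audio))

# The result is complex even though the input was real.
print(roundtrip.dtype)

# Size of the leftover imaginary part and of the error in the real part.
max_imag = np.max(np.abs(roundtrip.imag))
max_err = np.max(np.abs(roundtrip.real - audio))
print(max_imag, max_err)
```

If both printed magnitudes are tiny compared with the sample values, the round trip itself is not the source of the noise, and the conversion back to an audio sample format is the next thing to check.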

Jaime · Accepted Answer · 2013-01-29T04:23:56.373

Building on Joe Kington's comment, I downloaded this file and tried the following:

>>> import numpy as np
>>> import scipy.io.wavfile
>>> rate, data = scipy.io.wavfile.read('wahoo.wav')
>>> data
array([134, 134, 134, ..., 124, 124, 124], dtype=uint8)
>>> data_bis = np.fft.ifft(np.fft.fft(data))
>>> data_bis
array([ 134. +6.68519934e-14j,  134. -4.57982480e-14j,
        134. -1.78967708e-14j, ...,  124. -2.09835513e-14j,
        124. -1.61750469e-14j,  124. -2.14867343e-14j])
>>> data_bis = data_bis.astype('uint8')
C:\Users\Jaime y Eva\Desktop\stack_exchange.py:1: ComplexWarning: Casting complex values to real discards the imaginary part
  # -*- coding: utf-8 -*-
>>> data_bis
array([134, 133, 133, ..., 123, 123, 123], dtype=uint8)
>>> scipy.io.wavfile.write('wahoo_bis.wav', rate, data_bis)

And the resulting file plays exactly the same as the original one.

So turning the returned complex values into reals is only half the problem (the cast above does this implicitly by discarding the imaginary part, though you may want to go with np.abs rather than .real), and you then also need to recast your floating-point numbers to uints of the appropriate bit depth.
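
The two steps described above (drop the imaginary part, then cast back to the sample dtype) could be wrapped up roughly as follows. This is only a sketch: `fft_roundtrip` is a hypothetical helper name, and a synthetic 8-bit signal stands in for the file used in the session. It uses `.real` because `np.abs` would flip the sign of negative samples in signed formats, and it rounds before casting because `astype` truncates, which is why the session above shows 133 where the original has 134.

```python
import numpy as np

def fft_roundtrip(samples):
    """Return ifft(fft(samples)) cast back to the dtype of `samples`.

    Hypothetical helper for illustration, not part of NumPy or SciPy.
    """
    samples = np.asarray(samples)
    # Keep only the real part; the imaginary part is floating-point residue.
    restored = np.fft.ifft(np.fft.fft(samples)).real
    if np.issubdtype(samples.dtype, np.integer):
        # Round to the nearest integer and stay inside the dtype's range
        # before casting, since astype() alone would truncate.
        info = np.iinfo(samples.dtype)
        restored = np.clip(np.rint(restored), info.min, info.max)
    return restored.astype(samples.dtype)

# Assumption: synthetic unsigned 8-bit samples in place of a real .wav file.
rng = np.random.default_rng(0)
data = rng.integers(0, 256, size=1024, dtype=np.uint8)
data_bis = fft_roundtrip(data)

# Signed 16-bit samples go through the same path.
signed = np.array([-32768, -3, 0, 5, 32767], dtype=np.int16)
signed_bis = fft_roundtrip(signed)
```

The returned array has the same dtype as the input, so it can be passed to `scipy.io.wavfile.write` in the same way as `data_bis` in the session above.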

my problem was the wrong cast. i changed 'int8' to 'uint16'. since i was adding notes (read in as `.wav`) to make chords, i thought i should use the same dtype when writing the output. nope. ifft and fft are inverses, the universe makes sense again! — sam boosalis, Jan 29 '13 at 05:38
