I've been working with 2D FFTs in a project of mine, and have been unable to get correct results using two different FFT libraries. At first I assumed I was using them wrong, but upon comparing against MATLAB and Linux GCC reference implementations, it now seems something sinister is going on with my compiler (MSVC 2013 express).
My test case is as follows: 256x256 complex to real IFFT, with a single bin at 255 (0,255 for X,Y notation) set to 10000.
Using AMPFFT, I get the following 2D transform:
And with FFTW, I get the following 2D transform:
As you can see, the AMPFFT version is sort of "almost" correct, but has this weird, every sample banding in it, and the FFTW version is just all over the place and out to lunch.
I took the output of the two different test versions and compared them to MATLAB (technically octave, which uses FFTW under the hood). I also ran the same test case for FFTW under Linux with GCC. Here is a slice from that set of tests of the 127th row (the row number technically doesn't matter since with my choice of bins all rows should be identical):
In this example, the octave and Linux implementations represent the correct result and follow the red line (octave was plotted black, the Linux in red, and it agreed completely with the octave). The FFTW under MSVC is plotted blue, and the AMP FFT output is plotted in magenta. As you can see, again the AMPFFT version seems almost close, but has this weird high frequency ripple in it, and the FFTW under MSVC is just a mess, with this weird "packeted" look to it.
At this stage I can only point the finger at Visual Studio, but I have no idea what's going on or how to fix it.
Here are my two test programs:
FFTW test:
//fftwtest.cpp
//2 dimensional complex-to-real inverse FFT test.
//Produces a 256 x 256 real-valued matrix that is loadable by octave/MATLAB
#include <fstream>
#include <iostream>
#include <complex>
#include <fftw3.h>
int main(int argc, char** argv)
{
int FFTSIZE = 256;
std::complex<double>* cpxArray;
std::complex<double>* fftOut;
// cpxArray = new std::complex<double>[FFTSIZE * FFTSIZE];
//fftOut = new double[FFTSIZE * FFTSIZE];
fftOut = (std::complex<double>*)fftw_alloc_complex(FFTSIZE*FFTSIZE);
cpxArray = (std::complex<double>*)fftw_alloc_complex(FFTSIZE * FFTSIZE);
for(int i = 0; i < FFTSIZE * FFTSIZE; i++) cpxArray[i] = 0;
cpxArray[255] = std::complex<double>(10000, 0);
fftw_plan p = fftw_plan_dft_2d(FFTSIZE, FFTSIZE, (fftw_complex*)cpxArray, (fftw_complex*)fftOut, FFTW_BACKWARD, FFTW_DESTROY_INPUT | FFTW_ESTIMATE);
fftw_execute(p);
std::ofstream debugDump("debugdumpFFTW.txt");
for(int j = 0; j < FFTSIZE; j++)
{
for(int i = 0; i < FFTSIZE; i++)
{
debugDump << " " << fftOut[j * FFTSIZE + i].real();
}
debugDump << std::endl;
}
debugDump.close();
}
AMPFFT test:
//ampffttest.cpp
//2 dimensional complex-to-real inverse FFT test.
//Produces a 256 x 256 real-valued matrix that is loadable by octave/MATLAB
#include <amp_fft.h>
#include <fstream>
#include <iostream>
int main(int argc, char** argv)
{
int FFTSIZE = 256;
std::complex<float>* cpxArray;
float* fftOut;
cpxArray = new std::complex<float>[FFTSIZE * FFTSIZE];
fftOut = new float[FFTSIZE * FFTSIZE];
for(size_t i = 0; i < FFTSIZE * FFTSIZE; i++) cpxArray[i] = 0;
cpxArray[255] = std::complex<float>(10000, 0);
concurrency::extent<2> e(FFTSIZE, FFTSIZE);
std::cout << "E[0]: " << e[0] << " E[1]: " << e[1] << std::endl;
fft<float, 2> m_fft(e);
concurrency::array<float, 2> outpArray(concurrency::extent<2>(FFTSIZE, FFTSIZE));
concurrency::array<std::complex<float>, 2> inpArray(concurrency::extent<2>(FFTSIZE, FFTSIZE), cpxArray);
m_fft.inverse_transform(inpArray, outpArray);
std::vector<float> outVec = outpArray;
std::copy(outVec.begin(), outVec.end(), fftOut);
std::ofstream debugDump("debugdump.txt");
for(int j = 0; j < FFTSIZE; j++)
{
for(int i = 0; i < FFTSIZE; i++)
{
debugDump << " " << fftOut[j * FFTSIZE + i];
}
debugDump << std::endl;
}
}
Both of these were compiled with stock settings on MSVC 2013 for a win32 console app, and the FFTW test was also run under Centos 6.4 with GCC 4.4.7. Both FFTW tests use FFTW version 3.3.4, and as an aside both complex to real and complex to complex plans were tested (with identical results).
Does anyone have even the slightest clue on Visual Studio compiler settings I could try to fix this problem?