2

I've been working with 2D FFTs in a project of mine, and have been unable to get correct results using two different FFT libraries. At first I assumed I was using them wrong, but upon comparing against MATLAB and Linux GCC reference implementations, it now seems something sinister is going on with my compiler (MSVC 2013 express).

My test case is as follows: 256x256 complex to real IFFT, with a single bin at 255 (0,255 for X,Y notation) set to 10000.

Using AMPFFT, I get the following 2D transform:

Broken AMPFFT example

And with FFTW, I get the following 2D transform:

Broken FFTW example

As you can see, the AMPFFT version is sort of "almost" correct, but has this weird, every sample banding in it, and the FFTW version is just all over the place and out to lunch.

I took the output of the two different test versions and compared them to MATLAB (technically octave, which uses FFTW under the hood). I also ran the same test case for FFTW under Linux with GCC. Here is a slice from that set of tests of the 127th row (the row number technically doesn't matter since with my choice of bins all rows should be identical):

FFT comparisons

In this example, the octave and Linux implementations represent the correct result and follow the red line (octave was plotted black, the Linux in red, and it agreed completely with the octave). The FFTW under MSVC is plotted blue, and the AMP FFT output is plotted in magenta. As you can see, again the AMPFFT version seems almost close, but has this weird high frequency ripple in it, and the FFTW under MSVC is just a mess, with this weird "packeted" look to it.

At this stage I can only point the finger at Visual Studio, but I have no idea what's going on or how to fix it.

Here are my two test programs:

FFTW test:

//fftwtest.cpp
//2 dimensional complex-to-real inverse FFT test.
//Produces a 256 x 256 real-valued matrix that is loadable by octave/MATLAB
#include <fstream>
#include <iostream>
#include <complex>
#include <fftw3.h>

int main(int argc, char** argv)
{
    int FFTSIZE = 256;

    std::complex<double>* cpxArray;
    std::complex<double>* fftOut;

   // cpxArray = new std::complex<double>[FFTSIZE * FFTSIZE];
    //fftOut = new double[FFTSIZE * FFTSIZE];
    fftOut = (std::complex<double>*)fftw_alloc_complex(FFTSIZE*FFTSIZE);
    cpxArray = (std::complex<double>*)fftw_alloc_complex(FFTSIZE * FFTSIZE);

    for(int i = 0; i < FFTSIZE * FFTSIZE; i++) cpxArray[i] = 0;
    cpxArray[255] = std::complex<double>(10000, 0);
    fftw_plan p = fftw_plan_dft_2d(FFTSIZE, FFTSIZE, (fftw_complex*)cpxArray, (fftw_complex*)fftOut, FFTW_BACKWARD, FFTW_DESTROY_INPUT | FFTW_ESTIMATE);
    fftw_execute(p);

    std::ofstream debugDump("debugdumpFFTW.txt");
    for(int j = 0; j < FFTSIZE; j++)
    {
        for(int i = 0; i < FFTSIZE; i++)
        {
            debugDump << " " << fftOut[j * FFTSIZE + i].real();
        }
        debugDump << std::endl;
    }
    debugDump.close();

}

AMPFFT test:

//ampffttest.cpp
//2 dimensional complex-to-real inverse FFT test.
//Produces a 256 x 256 real-valued matrix that is loadable by octave/MATLAB
#include <amp_fft.h>
#include <fstream>
#include <iostream>



int main(int argc, char** argv)
{
    int FFTSIZE = 256;

    std::complex<float>* cpxArray;
    float* fftOut;

    cpxArray = new std::complex<float>[FFTSIZE * FFTSIZE];
    fftOut = new float[FFTSIZE * FFTSIZE];

    for(size_t i = 0; i < FFTSIZE * FFTSIZE; i++) cpxArray[i] = 0;
    cpxArray[255] = std::complex<float>(10000, 0);

    concurrency::extent<2> e(FFTSIZE, FFTSIZE);
    std::cout << "E[0]: " << e[0] << " E[1]: " << e[1] << std::endl;

    fft<float, 2> m_fft(e);
    concurrency::array<float, 2> outpArray(concurrency::extent<2>(FFTSIZE, FFTSIZE));
    concurrency::array<std::complex<float>, 2> inpArray(concurrency::extent<2>(FFTSIZE, FFTSIZE), cpxArray);


    m_fft.inverse_transform(inpArray, outpArray);
    std::vector<float> outVec = outpArray;

    std::copy(outVec.begin(), outVec.end(), fftOut);

    std::ofstream debugDump("debugdump.txt");
    for(int j = 0; j < FFTSIZE; j++)
    {
        for(int i = 0; i < FFTSIZE; i++)
        {
            debugDump << " " << fftOut[j * FFTSIZE + i];
        }
        debugDump << std::endl;
    }
}

Both of these were compiled with stock settings on MSVC 2013 for a win32 console app, and the FFTW test was also run under Centos 6.4 with GCC 4.4.7. Both FFTW tests use FFTW version 3.3.4, and as an aside both complex to real and complex to complex plans were tested (with identical results).

Does anyone have even the slightest clue on Visual Studio compiler settings I could try to fix this problem?

stix
  • 1,140
  • 13
  • 36
  • Don't know if it's related, but juggling the types `fftw_complex` and `std::complex` looks dubious. You are not using any features of std::complex. Better stick to your library native type. – n. m. could be an AI Jun 12 '14 at 04:52
  • 1
    In your AMP code, I noticed the intended output buffer (fftOut) is used to initialize the input source (inpArray) for the inverse transformation as in the following definition: `concurrency::array, 2> inpArray(concurrency::extent<2>(FFTSIZE, FFTSIZE), fftOut);` this would cause undefined behavior because fftOut buffer is not filled with valid values and its size is only half of inpArray. I think cpxArray should be used to initialize inpArray. Can you try that and see if it solves your problem? – Lukasz Mendakiewicz Jun 12 '14 at 18:49
  • Whoops! Yes, that's broken, but it doesn't solve the problem. I accidentally uploaded an older version of the test with a bug in it; I was simplifying the project code where I'm seeing the problem and accidentally changed the inpArray variable name to the wrong array. I've corrected the AMPSFFT test to use the correct buffer. – stix Jun 12 '14 at 19:13

1 Answers1

1

Looking at the MSVC FFTW output in blue, it appears to be a number of sines multiplied. That is to say, there's a sine with period 64 or so, and a sine with period 4, and possibly yet another one with a similar frequency to produce a beat.

That basically means the MSVC version had at least two non-zero inputs. I suspect the cause is the typecasting, as you write a fftw_complex object through a std::complex<double> type.

MSalters
  • 173,980
  • 10
  • 155
  • 350
  • 1
    As mentioned in this post: http://stackoverflow.com/questions/4214400/problem-casting-stl-complexdouble-to-fftw-complex The fftw_complex type is *supposed* to be bit-compatible with std::complex. If the MSVC version had two non-zero inputs, the output would be a combination of the two basis functions, but in the plot there is no evidence whatsoever of the "correct" basis function (the one at bin 255). It also doesn't explain what's happening with the AMPFFT test case, which presumably should be the most compatible with MSVC, nor why the FFTW case works under GCC in Linux. – stix Jun 12 '14 at 15:03
  • @stix: It would appear that MSVC indeed has 2 bins incorrectly set to non-zero values and one (255) incorrectly set to zero. Clearly the write doesn't go where you think it should go. I blame pointer mishaps. Your AMPFFT example has yet another problem: cpxArray is write-only. – MSalters Jun 12 '14 at 15:11