//EDIT...
I'm editing my question slightly to address the issue of working specifically with non-power-of-two images. I've got a basic structure that works with square grayscale images with sizes like 256x256 or 1024x1024, but can't see how to generalize to arbitrarily sized images. The fft functions seem to want you to include the log2 of the width and height, but then its unclear how to unpack the resulting data, or if the data isn't just getting scrambled. I suppose the obvious thing to do would be to center the npot image within a larger, all black image and then ignore any values in those positions when looking at the data. But wondering if there's a less awkward way to work with npot data.
//...END EDIT
I'm having a bit of trouble with the Accelerate Framework documentation. I would normally use FFTW3, but I'm having trouble getting that to compile on an actual IOS device (see this question). Can anybody point me to a super simple implementation using Accelerate that does something like the following:
1) Turns image data into an appropriate data structure that can be passed to Accelerate's FFT methods.
In FFTW3, at its simplest, using a grayscale image, this involves placing the unsigned bytes into a "fftw_complex" array, which is simply a struct of two floats, one holding the real value and the other the imaginary (and where the imaginary is initialized to zero for each pixel).
2) Takes this data structure and performs an FFT on it.
3) Prints out the magnitude and phase.
4) Performs an IFFT on it.
5) Recreates the original image from the data resulting from the IFFT.
Although this is a very basic example, I am having trouble using the documentation from Apple's site. The SO answer by Pi here is very helpful, but I am still somewhat confused about how to use Accelerate to do this basic functionality using a grayscale (or color) 2D image.
Anyhow, any pointers or especially some simple working code that processes a 2D image would be extremely helpful!
\\\ EDIT \\\
Okay, after taking some time to dive into the documentation and some very helpful code on SO as well as on pkmital's github repo, I've got some working code that I thought I'd post since 1) it took me a while to figure it out and 2) since I have a couple of remaining questions...
Initialize FFT "plan". Assuming a square power-of-two image:
#include <Accelerate/Accelerate.h>
...
UInt32 N = log2(length*length);
UInt32 log2nr = N / 2;
UInt32 log2nc = N / 2;
UInt32 numElements = 1 << ( log2nr + log2nc );
float SCALE = 1.0/numElements;
SInt32 rowStride = 1;
SInt32 columnStride = 0;
FFTSetup setup = create_fftsetup(MAX(log2nr, log2nc), FFT_RADIX2);
Pass in a byte array for a square power-of-two grayscale image and turn it into a COMPLEX_SPLIT:
COMPLEX_SPLIT in_fft;
in_fft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
in_fft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
for ( UInt32 i = 0; i < numElements; i++ ) {
if (i < t->width * t->height) {
in_fft.realp[i] = t->data[i] / 255.0;
in_fft.imagp[i] = 0.0;
}
}
Run the FFT on the transformed image data, then grab the magnitude and phase:
COMPLEX_SPLIT out_fft;
out_fft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
out_fft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
fft2d_zop ( setup, &in_fft, rowStride, columnStride, &out_fft, rowStride, columnStride, log2nc, log2nr, FFT_FORWARD );
magnitude = (float *) malloc(numElements * sizeof(float));
phase = (float *) malloc(numElements * sizeof(float));
for (int i = 0; i < numElements; i++) {
magnitude[i] = sqrt(out_fft.realp[i] * out_fft.realp[i] + out_fft.imagp[i] * out_fft.imagp[i]) ;
phase[i] = atan2(out_fft.imagp[i],out_fft.realp[i]);
}
Now you can run an IFFT on the out_fft data to get the original image...
COMPLEX_SPLIT out_ifft;
out_ifft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
out_ifft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
fft2d_zop (setup, &out_fft, rowStride, columnStride, &out_ifft, rowStride, columnStride, log2nc, log2nr, FFT_INVERSE);
vsmul( out_ifft.realp, 1, SCALE, out_ifft.realp, 1, numElements );
vsmul( out_ifft.imagp, 1, SCALE, out_ifft.imagp, 1, numElements );
Or you can run an IFFT on the magnitude to get an autocorrelation...
COMPLEX_SPLIT in_ifft;
in_ifft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
in_ifft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
for (int i = 0; i < numElements; i++) {
in_ifft.realp[i] = (magnitude[i]);
in_ifft.imagp[i] = 0.0;
}
fft2d_zop ( setup, &in_fft, rowStride, columnStride, &out_ifft, rowStride, columnStride, log2nc, log2nr, FFT_INVERSE );
vsmul( out_ifft.realp, 1, SCALE, out_ifft.realp, 1, numElements );
vsmul( out_ifft.imagp, 1, SCALE, out_ifft.imagp, 1, numElements );
Finally, you can put the ifft results back into an image array:
for ( UInt32 i = 0; i < numElements; i++ ) {
t->data[i] = (int) (out_ifft.realp[i] * 255.0);
}
I haven't figured out how to use the Accelerate framework to handle non-power-of-two images. If I allocate enough memory in the setup, then I can do an FFT, followed by an IFFT to get my original image. But if try to do an autocorrelation (with the magnitude of the FFT), then my image gets wonky results. I'm not sure of the best way to pad the image appropriately, so hopefully someone has an idea of how to do this. (Or share a working version of the vDSP_conv method!)