3

Can anyone advise on the correct FFT to be using (Real or Complex)? I have looked here but still have questions.

I want to do image correlation to identify the location of a sub image within a master image. I understand the basics of FFTs and iFFTs.

The plan:

  1. Perform an FFT on a master image. 512x512
  2. Take complex conjugate of sub image.
  3. Perform an FFT on the sub image. 30x30 but padded with zeros to 512x512
  4. Complex Multiply the two resulting matrixes
  5. Perform iFFT on result
  6. Even though the result should be (mostly) real, take the magnitude of resulting matrix
  7. Look for maximum value which should correspond to maximum correlation.

I am having trouble getting the results that I anticipate.

If I use the real 2d fft (vDSP_fft2dzrip), the result is in a packed format that makes it hard to use a vDSP_zvmul to multiply to two result matrixes.

If I use the complex fft (vDSP_fft2dzip), I fail to get any correlation at all.

The apple examples and most of the audio examples don't do anything with the results of the forward FFT other than do the inverse.

Can anyone help me get started with image correlation? First question...can I use the complex FFT and avoid the packed format?

Community
  • 1
  • 1
Bill Johnson
  • 380
  • 2
  • 10

1 Answers1

3

The only difference between a real and complex FFT is that the real FFT can be slightly more efficient by using a clever packing scheme that transforms a 2^n real FFT into a 2^(n-1) complex FFT. The results should be the same in both cases. So I would stick with the complex FFT for simplicity if I were you, at least until you have everything working.

Have you also taken a look at vImageConvolve_ARGB8888? It seems to do what you are trying to do, with a lot less effort :)

Tark
  • 5,153
  • 3
  • 24
  • 23
  • I am trying to identify street signs in an uncertain environment and when I tried the vImageConvolve the white areas of the image overloaded the target image. I was not sure how to normalize the convolution so decided to try the frequency domain approach. – Bill Johnson Dec 15 '12 at 15:12
  • The `vImageConvolve` functions will certainly be using the frequency domain implementation, it is the only practical way to implement this algorithm for non trivial images. Maybe you can try the float variants , or scaling your images first? – Tark Dec 15 '12 at 15:15
  • The documentation of the vImageConvolve says that it is doing a sum of Kernel(x,y) * Pixel(x,y) / M x N. I believe that it does the convolution in the spacial mode. It slides the target over the big image and calculates the average for each spot. The white areas override and detail areas. I did try a Sobel edge detect on both images but the trees had so many edges that it did not work either. – Bill Johnson Dec 15 '12 at 15:24
  • Whether the algorithm is performed in the spatial or the frequency domain, the result will be almost exactly the same. The only difference is the length of time it takes to get there. – Tark Dec 19 '12 at 17:52
  • 1
    The original question is answered. There is no advantage in using the "real" FFT if you are not worried about speed. – Bill Johnson Dec 20 '12 at 13:28
  • @BillJohnson I tried `vImageConvolve`. But it is too slow for large images and runs `O(n^2)`. FFT based implementation could be faster at `O(nlogn)`. – kiranpradeep Apr 21 '15 at 17:56