Interpret numpy.fft.fft2 output

Question

My goal is to obtain a plot of the spatial frequencies of an image, kind of like doing a Fourier transform on it. I don't care where in the image features with a given frequency f are located; I'd just like a graphic that tells me how much of every frequency I have (the amplitude for a frequency band could be represented by the sum of contrasts with that frequency).

I am trying to do this via the numpy.fft.fft2 function.

Here is a link to a minimal example portraying my use case.

As it turns out I only get distinctly larger values for frequencies[:30,:30], and of these the absolute highest value is frequencies[0,0]. How can I interpret this?

What exactly does the amplitude of each value stand for?
What does it mean that my highest value is in frequencies[0,0]? What is a 0 Hz frequency?
Can I bin the values somehow so that my frequency spectrum is orientation agnostic?

This question appears to be off-topic because it is about understanding what a Fourier transform does (try http://dsp.stackexchange.com). — Oliver Charlesworth, Jan 26 '14 at 11:34
I understand what a fft does in principle, I just don't really get the `numpy.fft.fft2` output, I would have expected a 1D array with no "null" frequency band. — TheChymera, Jan 26 '14 at 11:37

unutbu · Answer 1 · 2014-01-26T13:48:43.303

freq has a few very large values, and lots of small values. You can see that by plotting

plt.hist(freq.ravel(), bins=100)

(See below.) So, when you use

ax1.imshow(freq, interpolation="none")

Matplotlib uses freq.min() as the lowest value in the color range (colored blue in the jet colormap, which was the default in Matplotlib versions before 2.0), and freq.max() as the highest value in the color range (colored red in that colormap). Since almost all the values in freq are near the blue end, the plot as a whole looks blue.

You can get a more informative plot by rescaling the values in freq so that the low values are more widely distributed on the color range.

For example, you can get a better distribution of values by taking the log of freq. (You probably don't want to throw away the highest values, since they correspond to frequencies with the highest power.)

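import matplotlib.pyplot as plt
import numpy as np
from PIL import Image  # Pillow; the bare "import Image" only works with the legacy PIL package

file_path = "data"  # path to the image file
# Load the image and convert it to 8-bit grayscale
image = np.asarray(Image.open(file_path).convert('L'))
# Keep only the magnitude of the complex 2D DFT coefficients
freq = np.fft.fft2(image)
freq = np.abs(freq)

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(14, 6))
# Top row: histograms of the raw and the log-scaled magnitudes
ax[0,0].hist(freq.ravel(), bins=100)
ax[0,0].set_title('hist(freq)')
ax[0,1].hist(np.log(freq).ravel(), bins=100)
ax[0,1].set_title('hist(log(freq))')
# Bottom row: log-scaled magnitudes as an image, next to the original image
ax[1,0].imshow(np.log(freq), interpolation="none")
ax[1,0].set_title('log(freq)')
ax[1,1].imshow(image, interpolation="none")
plt.show()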

[Figure: histograms of freq and log(freq), an image of log(freq), and the original image]

From the docs:

The output, analogously to fft, contains the term for zero frequency in the low-order corner of the transformed axes,

Thus, freq[0,0] is the "zero frequency" term. In other words, it is the constant term in the discrete Fourier Transform.
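As a small check of what the constant term holds: with NumPy's default (unnormalized) transform, the coefficient at [0, 0] should equal the sum of all pixel values, which would explain why it dominates for an image with non-negative pixels. The sketch below uses a made-up random array in place of the real image; the names `coeffs`, `dc`, and `coeffs_no_dc` are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for a grayscale image (values 0..255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 48)).astype(float)

coeffs = np.fft.fft2(image)

# The [0, 0] coefficient is the constant (DC) term: the plain sum of all
# pixels under NumPy's default normalization.
dc = coeffs[0, 0]
print(np.isclose(dc.real, image.sum()), np.isclose(dc.imag, 0.0))

# Subtracting the mean before transforming removes the constant term,
# so it no longer dwarfs the other coefficients.
coeffs_no_dc = np.fft.fft2(image - image.mean())
print(abs(coeffs_no_dc[0, 0]) < 1e-6 * abs(dc))
```

If the large [0, 0] value gets in the way of plotting, subtracting the image mean first is one option; the other coefficients are unaffected by it.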

edited Jan 26 '14 at 13:48

answered Jan 26 '14 at 13:39

unutbu


`freq.ravel()` unravels the 2d array into a 1d array whereby each row is read consecutively - right? how come I don't get a second peak at 200, a third at 400 etc (as the log(freq) plot would indicate)? Also, why does hist(log(freq)) stop at 16 on the x-axis? – TheChymera Jan 26 '14 at 13:46
`plt.hist` is making a histogram of the values. The `x-axis` represents values of `log(freq)` and the `y-axis` represents a count of how frequently those values occur. There are no repetitive peaks because similar values are being binned together. The upper value of `16` means that the largest value in `log(freq)` is near 16. Indeed, `np.log(freq.max())` equals 14.8. – unutbu Jan 26 '14 at 13:53
(And yes, `freq.ravel()` is a 1D view of the 2D array.) – unutbu Jan 26 '14 at 13:55
got it ;) and what spatial frequency (as measured in 1/px) does the value in freq[0,1] for instance correspond to? – TheChymera Jan 26 '14 at 13:57
You could use [numpy.fft.fftfreq](http://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.fftfreq.html) for that. There is a 2D example [here](http://stackoverflow.com/a/14583370/190597). – unutbu Jan 26 '14 at 14:05
so the actual information about what frequency each pixel corresponds to is not included in the np.fft.fft2 output? – TheChymera Jan 26 '14 at 14:27
Right. The `freq` variable is holding DFT coefficients. (Yeah, the variable name should probably be changed.) The *location* of the coefficient in the array indicates the frequency. – unutbu Jan 26 '14 at 14:47
right - so if a coefficient is at position freq[0,1], what frequency does that indicate? – TheChymera Jan 26 '14 at 15:13
There are two frequencies associated with `freq[0,1]`: The frequency in the `x-direction`, and the frequency in the `y-direction`. The frequencies would be given by `np.fft.fftfreq(image.shape[0])[0], np.fft.fftfreq(image.shape[1])[1]`. Reading [this](http://students.mimuw.edu.pl/~pbechler/npdoc/reference/routines.fft.html) might help too. – unutbu Jan 26 '14 at 17:15
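The index-to-frequency mapping described in the comment above can be sketched as follows, assuming frequencies are measured in cycles per pixel (the default unit of `np.fft.fftfreq`). The test image, the names `fy`, `fx`, `radius`, `radial_sum`, and the choice of 50 bins are all hypothetical. The second half shows one possible way to sum amplitudes in rings of equal radial frequency, which may be what the question means by an orientation-agnostic spectrum; it is a sketch, not the only reasonable binning.

```python
import numpy as np

# Hypothetical test image: a horizontal cosine with 10 cycles across the width.
n_rows, n_cols = 200, 300
x = np.arange(n_cols)
image = np.tile(np.cos(2 * np.pi * 10 * x / n_cols), (n_rows, 1))

amplitude = np.abs(np.fft.fft2(image))

# Frequency (cycles per pixel) for each index along each axis.
fy = np.fft.fftfreq(n_rows)  # axis 0: vertical direction
fx = np.fft.fftfreq(n_cols)  # axis 1: horizontal direction

# The strongest coefficient sits at an index whose frequency is +/- 10 / n_cols
# along axis 1 and zero along axis 0.
row, col = np.unravel_index(np.argmax(amplitude), amplitude.shape)
print(row, col, fy[row], abs(fx[col]))

# Radial frequency of every coefficient, ignoring orientation.
radius = np.hypot(fy[:, None], fx[None, :])

# Sum the amplitudes in rings of equal radial frequency (n_bins is arbitrary).
n_bins = 50
edges = np.linspace(0.0, radius.max(), n_bins + 1)
ring = np.clip(np.digitize(radius.ravel(), edges) - 1, 0, n_bins - 1)
radial_sum = np.bincount(ring, weights=amplitude.ravel(), minlength=n_bins)
centers = 0.5 * (edges[:-1] + edges[1:])
```

Plotting `radial_sum` against `centers` then gives a 1D spectrum; dividing by the number of coefficients per ring would give a mean rather than a sum.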
what if `freq` contains 0 and `log(freq)` gives `-Inf`? – user2829759 Aug 22 '16 at 07:09
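One possible way to handle the zero case raised in the last comment is to use `np.log1p`, which computes log(1 + x) and stays finite at x = 0. This is a sketch with a made-up array, not part of the original answer; adding a small constant before taking the log would be another option.

```python
import numpy as np

# Hypothetical amplitude array that contains an exact zero.
amplitude = np.array([[0.0, 1.0], [10.0, 1000.0]])

# np.log maps 0 to -inf (and normally emits a RuntimeWarning), which can
# break histogram binning and color scaling.
with np.errstate(divide="ignore"):
    raw_log = np.log(amplitude)

# log1p computes log(1 + x), so a zero amplitude maps to 0 instead of -inf.
safe_log = np.log1p(amplitude)

print(np.isinf(raw_log).any(), np.isfinite(safe_log).all())
```

For large amplitudes `log1p(x)` and `log(x)` are nearly identical, so the plots keep the same overall shape.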
