7

I analyzed the sunspots.dat data (below) using fft which is a classic example in this area. I obtained results from fft in real and imaginery parts. Then I tried to use these coefficients (first 20) to recreate the data following the formula for Fourier transform. Thinking real parts correspond to a_n and imaginery to b_n, I have

import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x):
    total = 0
    for i in range(20):
        total += Y.real[i]*np.cos(i*x) + Y.imag[i]*np.sin(i*x)
    return total

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
n=len(Y)
print n

xs = linspace(0, 2*pi,1000)
gplt.plot(xs, [f(Y, x) for x in xs], '.')
gplt.show()       

For some reason however, my plot does not mirror the one generated by ifft (I use the same number of coefficients on both sides). What could be wrong ?

Data:

http://linuxgazette.net/115/misc/andreasen/sunspots.dat

mtrw
  • 34,200
  • 7
  • 63
  • 71
BBSysDyn
  • 4,389
  • 8
  • 48
  • 63
  • 2
    Just out of curiosity, what are you doing with the spectrum? If you're trying to determine relative spectral amplitudes of various components, you might want to use a data window (en.wikipedia.org/wiki/Window_function). For instance, if you plot `np.abs(fft(wolfer*hanning(len(wolfer))))` the peak around n=30 shows a little more structure than without the window. – mtrw Dec 15 '10 at 17:01

2 Answers2

14

When you called fft(wolfer), you told the transform to assume a fundamental period equal to the length of the data. To reconstruct the data, you have to use basis functions of the same fundamental period = 2*pi/N. By the same token, your time index xs has to range over the time samples of the original signal.

Another mistake was in forgetting to do to the full complex multiplication. It's easier to think of this as Y[omega]*exp(1j*n*omega/N).

Here's the fixed code. Note I renamed i to ctr to avoid confusion with sqrt(-1), and n to N to follow the usual signal processing convention of using the lower case for a sample, and the upper case for total sample length. I also imported __future__ division to avoid confusion about integer division.

forgot to add earlier: Note that SciPy's fft doesn't divide by N after accumulating. I didn't divide this out before using Y[n]; you should if you want to get back the same numbers, rather than just seeing the same shape.

And finally, note that I am summing over the full range of frequency coefficients. When I plotted np.abs(Y), it looked like there were significant values in the upper frequencies, at least until sample 70 or so. I figured it would be easier to understand the result by summing over the full range, seeing the correct result, then paring back coefficients and seeing what happens.

from __future__ import division
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x, N):
    total = 0
    for ctr in range(len(Y)):
        total += Y[ctr] * (np.cos(x*ctr*2*np.pi/N) + 1j*np.sin(x*ctr*2*np.pi/N))
    return real(total)

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
N=len(Y)
print(N)

xs = range(N)
gplt.plot(xs, [f(Y, x, N) for x in xs])
gplt.show()
U3.1415926
  • 812
  • 12
  • 30
mtrw
  • 34,200
  • 7
  • 63
  • 71
  • For the first 20, I just changed the line to "for ctr in range(20)", and that fits perfectly with ifft with the same number of coefficients. Your len(Y) obviously uses the entire thing and that fits perfectly with the data. Very cool. Thanks. – BBSysDyn Dec 15 '10 at 17:14
  • 1
    You might want to use both the first 20 and last 20. If you plot `abs(Y)`, you'll see the coefficients are symmetric. But if you `print` the values, you'll see that they're actually complex conjugates of each other. This is due to the FFT's Hermitian symmetry for real data. The upshot is you won't get back a real answer without using both the low and high frequency coefficient. – mtrw Dec 15 '10 at 17:23
  • 1
    It's going to take weeks to digest this :) Thanks again. – BBSysDyn Dec 15 '10 at 17:45
0

The answer from mtrw was extremely helpful and helped me answer the same question as the OP, but my head almost exploded trying to understand the nested loop.

Here's the last part but with numpy broadcasting (not sure if this even existed when the question was asked) rather than calling the f function:

xs = np.arange(N)
omega = 2*np.pi/N
phase = omega * xs[:,None] * xs[None,:]
reconstruct = Y[None,:] * (np.cos(phase) + 1j*np.sin(phase))
reconstruct = (reconstruct).sum(axis=1).real / N

# same output
plt.plot(reconstruct)
plt.plot(wolfer)
SuperCodeBrah
  • 2,874
  • 2
  • 19
  • 33