I suspect that there's something I'm missing in my understanding of the Fourier Transform, so I'm looking for some correction (if that's the case). How should I gather peak information from the first plot below?

The dataset is hourly data for 911 calls over the past 17 years (for a particular city).

I've removed the trend from my data, and am now removing the seasonality. When I run the Fourier transform, I get the following plot: enter image description here

I believe the dataset does have some seasonality to it (looking at weekly data, I have this pattern): enter image description here

How do I pick out the values of the peaks in the first plot? Presumably I could ignore every "peak" below, say, 5000 in the first plot and leave that seasonality out of my final model, but only at a loss of accuracy, correct?

Here's the bit of code I'm working with, currently:

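import numpy as np
import matplotlib.pyplot as plt
from scipy import fftpack
# Subtract the mean first so the zero-frequency term doesn't dominate the plot
fft = fftpack.fft(calls_grouped_hour.detrended_residuals - calls_grouped_hour.detrended_residuals.mean())
plt.plot(1./(17*365)*np.arange(len(fft)), np.abs(fft))
plt.xlim([-.1, 23/2]);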

EDIT: After Mark Snyder's initial answer, I have the following plot: enter image description here

Adding code attempt to get peak values from fft:

Do I need to convert the values back using ifft first?

fft_x_y = np.stack((fft.real, fft.imag), -1)
peaks = []
for x, y in np.abs(fft_x_y):
    if (y >= 0): 
        peaks.append(x)

peaks = np.unique(peaks)
print('Length: ', len(peaks))
print('Peak values: ', '\n', np.sort(peaks))
alofgran

1 Answer

threshold = 5000
fft[np.abs(fft)<threshold] = 0

This'll give you an fft that ignores everything except the peaks. And no, I wouldn't imagine that the "noise" represents actual seasonality. The peak at fft[0] doesn't represent seasonality, either: it's the sum of the data (the mean times the number of samples), so if you plan on subtracting the ifft of the peaks, I wouldn't include fft[0] unless you want your data to be centered.
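As a rough sketch of that idea end to end: the snippet below builds a synthetic hourly series (an assumption standing in for the detrended call counts), keeps only the bins above a threshold, zeroes the fft[0] term, and subtracts the inverse transform from the data. It uses `numpy.fft` rather than `scipy.fftpack`, and the threshold rule is a placeholder; on real data you would choose it by looking at the plot.

```python
import numpy as np

# Synthetic stand-in for the detrended hourly series (assumption, not the real data)
rng = np.random.default_rng(0)
n_hours = 24 * 7 * 8                      # eight weeks of hourly samples
t = np.arange(n_hours)
signal = (3.0 * np.sin(2 * np.pi * t / 24)      # daily cycle
          + 2.0 * np.sin(2 * np.pi * t / 168)   # weekly cycle
          + rng.normal(0, 0.5, n_hours)         # noise
          + 10.0)                               # non-zero mean

spectrum = np.fft.fft(signal)

# Hypothetical threshold: a quarter of the largest non-DC magnitude
threshold = 0.25 * np.abs(spectrum[1:]).max()

# Keep only the bins above the threshold
peaks_only = spectrum.copy()
peaks_only[np.abs(peaks_only) < threshold] = 0
peaks_only[0] = 0                         # drop the DC term so the mean is left alone

# The kept bins come in conjugate pairs, so the inverse is real up to rounding
seasonal = np.fft.ifft(peaks_only).real
residual = signal - seasonal              # what remains to be modelled as noise
```

Here `residual` keeps the original mean because fft[0] was excluded; leaving fft[0] in would center the result instead.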

If you want just the peak values and not the full fft that you can invert, you can just do this:

peaks = [np.abs(value) for value in fft if np.abs(value)>threshold]
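The magnitudes alone don't say which cycle each peak belongs to. A minimal sketch of pairing each peak with its frequency and period follows, assuming one sample per hour and using a synthetic series in place of the real data. `np.fft.rfft` and `np.fft.rfftfreq` are used here so that each peak appears once instead of as a mirrored pair; the threshold rule is again only a placeholder.

```python
import numpy as np

# Synthetic hourly series with a daily and a weekly cycle (assumption)
n_hours = 24 * 7 * 8
t = np.arange(n_hours)
signal = 3.0 * np.sin(2 * np.pi * t / 24) + 2.0 * np.sin(2 * np.pi * t / 168)

# One-sided transform: real input, so the negative frequencies are redundant
spectrum = np.fft.rfft(signal - signal.mean())
freqs = np.fft.rfftfreq(n_hours, d=1.0)   # cycles per hour for 1-hour spacing

threshold = 0.25 * np.abs(spectrum).max() # hypothetical; pick from the plot
peak_idx = np.flatnonzero(np.abs(spectrum) > threshold)
peak_idx = peak_idx[peak_idx > 0]         # skip the zero-frequency bin

peak_periods_hours = 1.0 / freqs[peak_idx]

for k, period in zip(peak_idx, peak_periods_hours):
    print(f"bin {k}: period {period:.1f} h, magnitude {np.abs(spectrum[k]):.1f}")
```

Reading the periods off this way makes it easier to tell whether a peak is a daily, weekly, or yearly effect before deciding to remove it.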

Mark Snyder
  • While this provides a clearer plot (added to the original question above), I still struggle to get at the actual values. I've tried to "stack" the imaginary numbers so that I can filter out those with an x value greater than 0 (since the threshold dropped values < 5000 to 0). The 85 values left don't match the plot. Does the large date range cause problems deciphering seasonality? Maybe my initial question should have been about arranging the x-axis properly? Thanks for the note on the fft[0] value...I'd learned that at one point, but must've changed my code to disregard its removal again. – alofgran Jan 28 '20 at 00:34
  • @alofgran Oh, sorry. If you want **just** the peak values and not the full `fft` that you can invert, you can just do this: `peaks = [np.abs(value) for value in fft if np.abs(value)>threshold]`. I assumed you'd want to go on to invert these peaks and subtract the result from the data. – Mark Snyder Jan 28 '20 at 00:44
  • No need to apologize; you're absolutely right in your assumption. I'm just new, and you're just a step ahead of me. It sounds like, instead of picking out the actual peak values and creating a sine & cosine feature for each occurrence of seasonality (like I thought I needed to do), you're telling me that I can just subtract the inverted form of the `fft` list from my detrended residuals to get the noise that I can then model? – alofgran Jan 28 '20 at 00:54
  • @alofgran Essentially, yes. There's no difference between picking out the individual peaks, converting them to sinusoids, and subtracting each of them from the overall data versus doing the ifft of all of the peaks at once and then subtracting that signal from the overall data. If you do the individual conversions correctly, the end result should be exactly the same either way. If you're interested in looking at the individual seasonal effects, you might find it worthwhile to do the individual conversions. But if you just want to get rid of them, doing them all at once is easier. – Mark Snyder Jan 28 '20 at 01:10
  • @alofgran If you **are** interested in plotting the individual sinusoids, you may wish to see this previous answer of mine: https://stackoverflow.com/a/59726152/12482432 – Mark Snyder Jan 28 '20 at 01:29
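
The equivalence discussed in the comments above (inverting all the peaks at once versus converting each peak to its own sinusoid) can be checked numerically. This is only a sketch on a synthetic series, not the data from the question. It assumes a one-sided `np.fft.rfft` spectrum, where a bin `k` strictly between zero and the Nyquist bin corresponds to a cosine with amplitude `2*|X[k]|/n` and phase `angle(X[k])`.

```python
import numpy as np

# Synthetic series with two seasonal components (assumption)
n = 24 * 7 * 8
t = np.arange(n)
signal = 3.0 * np.sin(2 * np.pi * t / 24) + 2.0 * np.cos(2 * np.pi * t / 168 + 0.4)

spectrum = np.fft.rfft(signal)
threshold = 0.25 * np.abs(spectrum).max()       # hypothetical threshold
peak_idx = np.flatnonzero(np.abs(spectrum) > threshold)
# Skip the zero-frequency and Nyquist bins, which don't follow the 2/n rule
peak_idx = peak_idx[(peak_idx > 0) & (peak_idx < n // 2)]

# Route 1: keep only the peaks and invert them all at once
kept = np.zeros_like(spectrum)
kept[peak_idx] = spectrum[peak_idx]
all_at_once = np.fft.irfft(kept, n=n)

# Route 2: rebuild one cosine per peak and add them up
one_by_one = np.zeros(n)
for k in peak_idx:
    amplitude = 2.0 * np.abs(spectrum[k]) / n
    phase = np.angle(spectrum[k])
    one_by_one += amplitude * np.cos(2 * np.pi * k * t / n + phase)

print(np.allclose(all_at_once, one_by_one))
```

The two routes agree to floating-point precision, so the per-peak version is mainly useful when you want to inspect or plot each seasonal component separately.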