
Using Audacity, I generated and exported two very similar chirps of 1 second each. One has a frequency of 440.00Hz, and the other has a frequency of 440.01Hz.

Using Julia, I made a short script to generate a plot of the FFT:

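```julia
using WAV
using FFTW
using PyPlot

# wavread returns the samples first and the sample rate (in Hz) second
data, fs = wavread("440.01hz.wav")
# fft returns complex values; they are passed to plot unchanged here
plot(fft(data))
```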

The 440.01 plot looked about how I would expect, with a big spike at that frequency:

[Image: spike at around 440]

However, the same procedure repeated on the exact-integer 440 file yielded this result:

[Image: squiggle line]

A very jagged line with no spike. And zoomed out it looks like this (the x-axis goes to 44100, since that was the sample rate of the file in samples per second):

[Image: wild graph]

I've repeated the procedure with several more frequencies, and it seems to always produce a sensible result when the frequency is a non-integer, and a confusing result otherwise. What problem am I running into here?

Edit:

Here are the files:

440.00Hz http://www.mediafire.com/file/n6erdh3tkzslpro/440.00hz.wav/file

440.01Hz http://www.mediafire.com/file/2au05df2aelmn9o/440.01hz.wav/file

And here's a plot of both waves (almost indistinguishable), along with both FFTs, zoomed in:

[Image: both waveforms and their FFTs, zoomed in]

And zoomed out:

[Image: both waveforms and their FFTs, zoomed out]

The code used to generate these is the same as the one above, but with 4 plots (440 WAV, 440 FFT, 440.01 WAV, 440.01 FFT).

Edit2:

I figured out at least part of the problem. If I first pass the Fourier transform of the 440.00 Hz WAV to the absolute value function before plotting it, i.e. `plot(fft(data) .|> abs)`, I get a correct result:

[Image: plot of the absolute value of the FFT of the 440.00 Hz file]

So I know the solution to the problem now, but not why the solution works. The question still remains: what is it about integer frequencies that produces a graph with no spikes? Or, equally valid, why do fractional frequencies produce graphs with them?

Lucien
  • Have you ruled out plotting problems? e.g. by checking for `maximum(data)`? – laborg Dec 06 '19 at 04:31
  • I guess I can't rule out plotting problems, but `maximum(data)` yields the same number for both: 0.8001647999511704 – Lucien Dec 06 '19 at 04:48
  • The y-axis on the first plot (fft of 440.01) shows the spike going up to 500? – laborg Dec 06 '19 at 04:53
  • `data` is the array with the raw wav data which ranges from roughly -0.8 to 0.8, not the fourier transformed plot. For that one, the maximums are ~553 and ~0.07 respectively – Lucien Dec 06 '19 at 05:02
  • Did you plot the two signals in Julia? – Cris Luengo Dec 06 '19 at 05:20
  • Plotting them shows no irregularities in the waves. They're almost identical, and only barely start to get out of sync towards the end. – Lucien Dec 06 '19 at 05:27
  • I couldn't recreate the problem using Audacity (1 second, chirp 440->440 and 440.01->440.01, otherwise defaults) using Julia 1.0.5, FFTW 0.3.0 and UnicodePlots. The _real_ values at around 440 are nearly identical between the two signals. Could you check `findmax(real.(y[1:1000,1]))` where `y=fft(x)`? – laborg Dec 06 '19 at 06:19
  • Because this is rather surprising, we all suspect either user error or file corruption. This is why I asked about plotting the signals. I would suggest you make a script that loads the two signals and plots them, as well as their FFTs, into subplots of the same figure. Don’t zoom in or otherwise modify the figures. Show the script and the result of running the script. – Cris Luengo Dec 06 '19 at 06:34
  • I suggest that you upload your .wav files somewhere, with info about the exact sampling rate. – Moritz Schauer Dec 06 '19 at 06:47
  • I've edited the post to include the audio files now. `findmax(real.(y[1:1000,1]))` for 440 is (0.001139456477261394, 655) and for 440.01 is (553.9923751934822, 441) – Lucien Dec 06 '19 at 13:50
  • FFT produces complex number results, not real number results. Plot is only plotting the real part, which is zero for integer Hz, because of the length of your .wav file and because it starts at 0. – Matt Timmermans Dec 07 '19 at 03:47
  • @MattTimmermans It sounds like you know the solution to the question then. Why is it that integers produce a negligible real part and what does that have to do with the length of the file and where it starts? – Lucien Dec 07 '19 at 14:26

1 Answer

The (real) FFT decomposes your signal into a sum of sinusoidal components.

For each frequency (ignoring the negative frequencies for now) you get a complex number. The real part gives the cosine component, and the imaginary part gives the sine component.

You are making a .wav file with a sine wave in it, so you only get sine components, but you're plotting the real components so they're all 0.

Except... The FFT considers your signal to be periodic. When you use an arbitrary frequency, you don't end up with an integer number of cycles in the file, so there is a discontinuity when it wraps around from the end to the start.

Since your signal is not a perfect sinusoid in that case, you get some energy in the cosine components.
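To see this numerically without the .wav files, here is a minimal sketch that evaluates a single DFT bin straight from the definition, so it needs no packages. The sample rate (44100), the amplitude (0.8) and the helper names `dft_bin` and `sine_bin` are assumptions made up for the illustration; they are not taken from the asker's script. Bin 440 here corresponds to index 441 of Julia's 1-based `fft` output.

```julia
# Evaluate a single DFT bin straight from the definition (no packages needed):
#   X[k] = sum over n of x[n] * exp(-2πi * k * n / N)
dft_bin(x, k) = sum(x[n + 1] * cis(-2π * k * n / length(x)) for n in 0:length(x)-1)

# One second of a sine that starts at phase 0, similar to the Audacity exports.
# fs, amp and k are assumed values chosen to resemble the question.
function sine_bin(f; fs = 44100, amp = 0.8, k = 440)
    t = (0:fs-1) ./ fs
    x = amp .* sin.(2π .* f .* t)
    return dft_bin(x, k)
end

for f in (440.0, 440.01)
    X = sine_bin(f)
    println("f = $f Hz: real = $(real(X)), imag = $(imag(X)), abs = $(abs(X))")
end
```

With exactly 440 cycles in the buffer, the real part of that bin should come out as rounding noise while the magnitude stays large. At 440.01 Hz some of the energy moves into the real part, which should land in the same ballpark as the value of about 554 reported in the comments. Plotting `abs.(fft(data))` sidesteps the issue because the magnitude does not depend on how the energy is split between the two parts.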

--

What you're doing with this FFT is probably very far from what you want to do. If you ask a question about how to do what you're really trying to do, we might be able to help.

Matt Timmermans
  • Well, I think what I'm doing with the FFT is exactly what I want to be doing, but you can judge for yourself: I'm using it to calculate spectral flux in an audio file. Thank you for the answer by the way. – Lucien Dec 07 '19 at 15:30
  • That would normally be computed by comparing successive segments of a spectrogram. You could use this, for example: https://juliadsp.github.io/DSP.jl/stable/periodograms/#DSP.Periodograms.spectrogram – Matt Timmermans Dec 07 '19 at 22:56
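
As a rough sketch of that suggestion, spectral flux could be computed from a DSP.jl spectrogram along these lines. This assumes DSP.jl is installed; the function name `spectral_flux`, the frame length of 1024, the 50% overlap and the Hann window are illustrative choices, not something specified in the thread. Definitions of spectral flux vary; this one sums the squared frame-to-frame differences of the magnitudes.

```julia
using DSP  # assumed to be installed; provides spectrogram, power and hanning

# Spectral flux as the summed squared change in magnitude between
# successive spectrogram frames (one common definition among several).
function spectral_flux(x::AbstractVector, fs; n = 1024, noverlap = n ÷ 2)
    spec = spectrogram(x, n, noverlap; fs = fs, window = hanning)
    # power(spec) has one row per frequency bin and one column per frame;
    # the square root is proportional to the magnitude spectrum.
    mag = sqrt.(power(spec))
    d = diff(mag; dims = 2)           # change from each frame to the next
    return vec(sum(abs2, d; dims = 1))
end
```

`wavread` returns a matrix with one column per channel, so a single channel such as `data[:, 1]` would be passed as `x`. Because only magnitudes are compared, the real/imaginary split discussed in the answer does not affect the result.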