extracting phase information using numpy fft

Question

I am trying to use a fast fourier transform to extract the phase shift of a single sinusoidal function. I know that on paper, If we denote the transform of our function as T, then we have the following relations:

However, I am finding that while I am able to accurately capture the frequency of my cosine wave, the phase is inaccurate unless I sample at an extremely high rate. For example:

import numpy as np
import pylab as pl

num_t = 100000
t = np.linspace(0,1,num_t)
dt = 1.0/num_t
w = 2.0*np.pi*30.0
phase = np.pi/2.0

amp = np.fft.rfft(np.cos(w*t+phase))
freqs = np.fft.rfftfreq(t.shape[-1],dt)

print (np.arctan2(amp.imag,amp.real))[30]

pl.subplot(211)
pl.plot(freqs[:60],np.sqrt(amp.real**2+amp.imag**2)[:60])
pl.subplot(212)
pl.plot(freqs[:60],(np.arctan2(amp.imag,amp.real))[:60])
pl.show()

Using num=100000 points I get a phase of 1.57173880459.

Using num=10000 points I get a phase of 1.58022110476.

Using num=1000 points I get a phase of 1.6650441064.

What's going wrong? Even with 1000 points I have 33 points per cycle, which should be enough to resolve it. Is there maybe a way to increase the number of computed frequency points? Is there any way to do this with a "low" number of points?

EDIT: from further experimentation it seems that I need ~1000 points per cycle in order to accurately extract a phase. Why?!

EDIT 2: further experiments indicate that accuracy is related to number of points per cycle, rather than absolute numbers. Increasing the number of sampled points per cycle makes phase more accurate, but if both signal frequency and number of sampled points are increased by the same factor, the accuracy stays the same.

What happens if you increase the number of points? 1,000,000? 10,000,000? Does the phase asymptotically get closer to pi/2? — wallyk, Mar 15 '16 at 18:37
I'm no expert, but there is (inevitably) increasing noise in the FFT frequency and phase results as samplng rate reduces.lower sampling rates - that's appearing as a spread in the frequency component magnitude and shift in the phase of the main component. You'll also see reduction in the magnitude of the main frequency in the result, as the other components increase. — DisappointedByUnaccountableMod, Mar 15 '16 at 18:38
Yes, as number of points increases, the phase gets arbitrarily accurate. I should add that it's not the absolute number of points that seems to be important, rather, it's the number of points PER CYCLE that makes it more accurate. Sampling a 30Hz signal at 1kHz for 10 seconds give the same accuracy as sampling at 1kHz for 1 second, but sampling a 3Hz wave at 1kHz for 1 second gives the same accuracy as sampling a 30Hz wave at 10kHz for 1 second. — KBriggs, Mar 15 '16 at 18:39
What happens when you add quantisation to e.g. 8 or 12 bits in the samples? — DisappointedByUnaccountableMod, Mar 15 '16 at 18:41
I'm not sure I understand. What do you mean by quantize...bits? — KBriggs, Mar 15 '16 at 18:42

Ilja · Accepted Answer · 2016-03-15T18:57:57.280

5

Your points are not distributed equally over the interval, you have the point at the end doubled: 0 is the same point as 1. This gets less important the more points you take, obviusly, but still gives some error. You can avoid it totally, the linspace has a flag for this. Also it has a flag to return you the dt directly along with the array.

Do

t, dt = np.linspace(0, 1, num_t, endpoint=False, retstep=True)

instead of

t = np.linspace(0,1,num_t)
dt = 1.0/num_t

then it works :)

edited Mar 15 '16 at 18:57

answered Mar 15 '16 at 18:52

Ilja

2,024
12
28

Ah, so my dt was (slightly) wrong by a factor of 1/N? It does seem to work... That's mildly embarrassing - need to RTFM on linspace apparently. Thanks! – KBriggs Mar 15 '16 at 19:02
No, your dt was not the problem, this was just for convenience. The problem is, that you consider one point twice, in the period which ends there and in the next that begins there – Ilja Mar 15 '16 at 19:09
But then how could I get accurate real world data? If I take a real time-series I can't guarantee that the last point doesn't correspond to the first one, can I? – KBriggs Mar 15 '16 at 19:51

hotpaw2 · Answer 2 · 2016-03-15T23:05:16.290

1

The phase value in the result bin of an unrotated FFT is only correct if the input signal is exactly integer periodic within the FFT length. Your test signal is not, thus the FFT measures something partially related to the phase difference of the signal discontinuity between end-points of the test sinusoid. A higher sample rate will create a slightly different last end-point from the sinusoid, and thus a possibly smaller discontinuity.

If you want to decrease this FFT phase measurement error, create your test signal so the your test phase is referenced to the exact center (sample N/2) of the test vector (not the 1st sample), and then do an fftshift operation (rotate by N/2) so that there will be no signal discontinuity between the 1st and last point in your resulting FFT input vector of length N.

edited Mar 15 '16 at 23:05

answered Mar 15 '16 at 20:22

hotpaw2

70,107
14
90
153

Could you give a short example of this? In a real-world example where I have real-time data and cannot necessarily control the periodicity, how can I extract a meaningful phase value? – KBriggs Mar 15 '16 at 22:45
For arbitrary signals, just reference your FFT phase result to the center of the FFT vector, not to any end-point. You can do this either by using a fftshift pre-processing step, or by post processing: depending on whether the bin number is odd or even, you may have to flip the sign of the phase (which is an fftshift in the other domain). – hotpaw2 Mar 15 '16 at 23:08
In the "real-world" where you can't control periodicity in FFT aperture, you would usually window the data, and thus definitely want to measure phase near the center of the hump (of the window function). There's no signal to measure the phase of at the end-points, due to common window functions usually being near zero there. – hotpaw2 Mar 15 '16 at 23:16
Could you suggest a reference discussing this? I don't really understand what is meant by measuring phase near the middle of a window , or referencing phase to the center of the FFT vector? – KBriggs Mar 16 '16 at 02:03
Fair enough, I'll post it a bit later when I have some time to formulate it properly. Thanks. – KBriggs Mar 16 '16 at 13:16

score -1 · Answer 3 · answered May 14 '21 at 17:22

This snippet of code might help:

def reconstruct_ifft(data):
"""
In this function, we take in a signal, find its fft, retain the dominant modes and reconstruct the signal from that

Parameters
----------
data : Signal to do the fft, ifft

Returns 
-------
reconstructed_signal : the reconstructed signal

"""
N = data.size
yf = rfft(data)

amp_yf = np.abs(yf) #amplitude

yf = yf*(amp_yf>(THRESHOLD*np.amax(amp_yf)))

reconstructed_signal = irfft(yf)

return reconstructed_signal

The 0.01 is the threshold of amplitudes of the fft that you would want to retain. Making the THRESHOLD greater(more than 1 does not make any sense), will give fewer modes and cause higher rms error but ensures higher frequency selectivity. (Please adjust the TABS for the python code)

extracting phase information using numpy fft

3 Answers3

Linked