27

I have to compare two time-vs-voltage waveforms. Because of the peculiarity of the sources of these waveforms, one of them can be a time shifted version of the other.

How can I find out whether there is a time shift and, if so, how large it is?

I am doing this in Python and wish to use numpy/scipy libraries.

mtrw
Vishal

6 Answers

47

SciPy provides a correlation function, scipy.signal.correlate, which works well for small inputs and when you want non-circular correlation, meaning that the signal does not wrap around. Note that in mode='full' the array returned by signal.correlate has length len(a) + len(b) - 1 (41 here), so the index returned by argmax is offset by len(b) - 1 = 20 from the lag you seem to expect: index 20 corresponds to zero lag.

from scipy import signal, fftpack
import numpy
a = numpy.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = numpy.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])
print(numpy.argmax(signal.correlate(a, b)))  # 16
print(numpy.argmax(signal.correlate(b, a)))  # 24

The two different values correspond to whether the shift is in a or b.
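
To recover a signed lag rather than a raw index, one option is to build the lag axis explicitly. The following is a minimal sketch, assuming the same a and b as above; the helper name signed_lag is made up for illustration. It relies on the fact that, for signal.correlate(x, y) in 'full' mode, index len(y) - 1 corresponds to zero lag.

```python
import numpy as np
from scipy import signal

def signed_lag(x, y):
    """Lag (in samples) at which the full cross-correlation of x and y peaks.

    Hypothetical helper: a positive result means x looks like a delayed copy
    of y, a negative result means x looks like an advanced (earlier) copy.
    """
    corr = signal.correlate(x, y, mode="full")
    # Index len(y) - 1 of the 'full' output corresponds to zero lag.
    lags = np.arange(-(len(y) - 1), len(x))
    return int(lags[np.argmax(corr)])

a = np.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = np.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])

print(signed_lag(a, b))  # -4: a is b advanced by 4 samples
print(signed_lag(b, a))  # 4: b is a delayed by 4 samples
```

The sign then tells you whether to add or subtract the shift when aligning the two waveforms.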

If you want circular correlation, or if the signals are large, you can use the convolution theorem (via the Fourier transform), with the caveat that correlation is very similar to, but not identical to, convolution.

A = fftpack.fft(a)
B = fftpack.fft(b)
Ar = A.conjugate()
Br = B.conjugate()
print(numpy.argmax(numpy.abs(fftpack.ifft(Ar*B))))  # 4
print(numpy.argmax(numpy.abs(fftpack.ifft(A*Br))))  # 17

Again, the two values correspond to whether you're interpreting a shift in a or a shift in b.

The conjugation is needed because convolution flips one of the functions, whereas correlation does not. You can undo the flipping either by (circularly) reversing one of the signals and then taking the FFT, or by taking the FFT of the signal and then taking the complex conjugate. For a real signal the following holds: Ar = A.conjugate() = fft(numpy.roll(a[::-1], 1)). No negation is needed, as pointed out in the comments.
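
Because the FFT-based correlation is circular, the peak index is always non-negative, so a shift of -4 shows up as N - 4 = 17. One way to map the index back to a signed shift is to treat indices in the upper half of the output as negative. This is only a sketch: it uses numpy.fft instead of fftpack (my substitution), the helper name circular_shift is hypothetical, and it assumes two real signals of equal length.

```python
import numpy as np

def circular_shift(a, b):
    """Signed circular shift k such that b is approximately np.roll(a, k).

    Hypothetical helper; assumes real inputs of equal length.
    """
    n = len(a)
    A = np.fft.fft(a)
    B = np.fft.fft(b)
    # Circular cross-correlation: c[k] = sum_n a[n] * b[(n + k) % N]
    c = np.fft.ifft(A.conj() * B).real
    k = int(np.argmax(c))
    # Indices above N/2 wrap around to negative shifts.
    return k if k <= n // 2 else k - n

a = np.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = np.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])

print(circular_shift(a, b))  # 4: b equals a rolled right by 4 samples
print(circular_shift(b, a))  # -4
```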

Trevor Boyd Smith
Gus
  • Thanks for the answer. This is the first time I am seeing something that makes sense. Now one more question: depending on the 'sign' of the time shift value, I will either subtract or add the time shift. How do I get the sign? – Vishal Jan 14 '11 at 11:47
  • Wait... why do you need the negative? I don't think you need the negative. Let x(t) have transform X(f). By time reversal, x(-t) has transform X(-f). If x(t) is real, then X(-f) = conj(X(f)). Therefore, if x(t) is real, then x(-t) has transform conj(X(f)). No negative. – Steve Tjoa Jan 14 '11 at 17:20
  • @Steve: Thanks. I made a mistake when I was deriving it last night. – Gus Jan 15 '11 at 04:15
  • Thanks for this answer - it helped me out with my problem too. – tatlar Jan 31 '12 at 18:20
  • @SteveTjoa what Vishal is noting is that signal.correlate does not assume the signals to be periodic and so returns a positive or negative shift, whereas the second method always returns a positive shift, which is OK because the signals are supposed to be periodic. – Gabriel Devillers Jun 19 '18 at 11:57
  • Took me some time to understand the results 16 and 24 from the correlation function: correlate(a,b) -> 16 means that b is shifted such that the last value of array b matches a[16] correlate(b,a)->24 means that a is shifted such that the last value of array a matches the (zero?) expanded b[24] – Marco Nov 04 '19 at 10:00
  • You need to use the gradient, or respectively the 2nd gradient, as input. Otherwise these methods might fail when there is a ramp, e.g. a= [ 0 2 4 6 8 8 8 8 8 10 12 14 16 16 16 16 16 17 18 19 20] b=[-4 -3 -2 -1 0 2 4 6 8 8 8 8 8 10 12 14 16 16 16 16 16] numpy.argmax(numpy.abs(fftpack.ifft(Ar*B))) -> 0. When applying a=numpy.gradient(a) b=numpy.gradient(b), numpy.argmax(numpy.abs(fftpack.ifft(Ar*B))) -> 4. You need to apply the gradient a second time; then numpy.argmax(signal.correlate(a,b)) -> 16 also gives the correct result again. – Marco Nov 05 '19 at 14:03
  • To give perspective to "it works fine for **small input**", `scipy.signal.correlate()` took just 3ms on arrays of 30,000 samples on an old MacBook Air. – Matthew Walker Jun 21 '20 at 10:35
15

If one signal is a time-shifted version of the other, you will see a peak in the correlation. Since computing the correlation directly is expensive, it is better to use the FFT. So, something like this should work:

import numpy as np

af = np.fft.fft(a)
bf = np.fft.fft(b)
# multiplying by the conjugate gives the (circular) cross-correlation
c = np.fft.ifft(af * np.conj(bf))

time_shift = np.argmax(np.abs(c))
highBandWidth
  • I tried doing what you've suggested, for the case in hand it gave a wrong result. Example: >>> a21 array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0]) >>> a22 array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0]) >>> fa21 = np.fft.fft(a21) >>> fa22 = np.fft.fft(a22) >>> c = np.fft.ifft(fa21 * fa22) >>> time_shift = np.argmax(abs(c)) >>> time_shift 20 As you can see, the actual time shift is 4 points and not 20. Am I missing something here? – Vishal Jan 14 '11 at 09:14
  • -1. Incorrect because `c` is simply `a` convolved with `b`, not correlated. The time reversal will mess things up and not give the desired result. – Steve Tjoa Jan 14 '11 at 14:44
  • You're right Steve. I wrote the answer as a rough idea. I have corrected it to reflect the conjugation. – highBandWidth Jan 14 '11 at 17:05
  • Thanks for the edit. (This is only true for real signals, but I guess we can assume that.) – Steve Tjoa Jan 14 '11 at 17:17
  • Is there a way to find which signal is leading? – Shashank Sawant Mar 18 '15 at 09:07
  • I think if the peak of the real ifft output is at the beginning, signal 1 is shifted relative to signal 2. If the peak is at the end of the output, vice versa. – thomas.cloud Aug 30 '19 at 15:12
  • If you're working with stereo audio data, make sure to convert it to mono. I just spent hours debugging that one ;o) – Matthew Walker Jun 21 '20 at 09:41
9

This function is probably more efficient for real-valued signals. It uses rfft and zero pads the inputs to a power of 2 large enough to ensure linear (i.e. non-circular) correlation:

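import numpy as np

def rfft_xcorr(x, y):
    M = len(x) + len(y) - 1              # length of the linear correlation
    N = 2 ** int(np.ceil(np.log2(M)))    # next power of 2 >= M
    X = np.fft.rfft(x, N)                # zero-padded real FFTs
    Y = np.fft.rfft(y, N)
    cxy = np.fft.irfft(X * np.conj(Y))
    # keep the len(x) non-negative lags and the len(y) - 1 negative lags
    cxy = np.hstack((cxy[:len(x)], cxy[N-len(y)+1:]))
    return cxy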

The return value is length M = len(x) + len(y) - 1 (hacked together with hstack to remove the extra zeros from rounding up to a power of 2). The non-negative lags are cxy[0], cxy[1], ..., cxy[len(x)-1], while the negative lags are cxy[-1], cxy[-2], ..., cxy[-len(y)+1].
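
If the wrapped ordering of the lags is inconvenient, the same zero-padded FFT correlation can be returned together with an explicit, ascending lag axis. The sketch below is an illustration rather than part of the function above: xcorr_lags is a hypothetical helper name, and the comparison against np.correlate in 'full' mode is only there as a sanity check of the lag layout.

```python
import numpy as np

def xcorr_lags(x, y):
    """Zero-padded FFT cross-correlation with an explicit lag axis.

    Hypothetical helper: values are ordered from lag -(len(y) - 1)
    up to lag len(x) - 1.
    """
    M = len(x) + len(y) - 1
    N = 2 ** int(np.ceil(np.log2(M)))
    cxy = np.fft.irfft(np.fft.rfft(x, N) * np.conj(np.fft.rfft(y, N)), N)
    # Negative lags sit at the end of the circular output; move them first.
    values = np.concatenate((cxy[N - len(y) + 1:], cxy[:len(x)]))
    lags = np.arange(-(len(y) - 1), len(x))
    return lags, values

ref = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x = np.hstack(([2, -3, 9], 1.5 * ref, [0, 3, 8]))

lags, values = xcorr_lags(x, ref)
# Same numbers as the direct (non-FFT) linear correlation.
print(np.allclose(values, np.correlate(x, ref, mode="full")))  # True
print(lags[np.argmax(values)])  # 3
```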

To match a reference signal, I'd compute rfft_xcorr(x, ref) and look for the peak. For example:

def match(x, ref):
    cxy = rfft_xcorr(x, ref)
    index = np.argmax(cxy)
    if index < len(x):
        return index
    else: # negative lag
        return index - len(cxy)   

In [1]: ref = np.array([1,2,3,4,5])
In [2]: x = np.hstack(([2,-3,9], 1.5 * ref, [0,3,8]))
In [3]: match(x, ref)
Out[3]: 3
In [4]: x = np.hstack((1.5 * ref, [0,3,8], [2,-3,-9]))
In [5]: match(x, ref)
Out[5]: 0
In [6]: x = np.hstack((1.5 * ref[1:], [0,3,8], [2,-3,-9,1]))
In [7]: match(x, ref)
Out[7]: -1

It's not a robust way to match signals, but it is quick and easy.

Eryk Sun
3

Here's another option:

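import numpy as np
from scipy import signal

def get_max_correlation(original, match):
    # correlate by convolving with the time-reversed match signal
    z = signal.fftconvolve(original, match[::-1])
    # index match.size - 1 of z corresponds to zero lag
    lags = np.arange(z.size) - (match.size - 1)
    return lags[np.argmax(np.abs(z))]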
FFT
  • Works but [seems completely equivalent](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.correlate.html) to scipy.signal.correlate() from [Gus answer](https://stackoverflow.com/a/4690225/4669135) which by default uses scipy.signal.fftconvolve as soon as its faster (i.e. as soon as quadratic times hurt which is soon). – Gabriel Devillers Jun 19 '18 at 11:43
  • fails the same as Gus answer when the data is e.g. increasing. a= [ 0 2 4 6 8 8 8 8 8 10 12 14 16 16 16 16 16 17 18 19 20] b=[-4 -3 -2 -1 0 2 4 6 8 8 8 8 8 10 12 14 16 16 16 16 16] get_max_correlation(a,b) -> 0, When applying a=numpy.gradient(a) b=numpy.gradient(b) it correctly returns get_max_correlation(a,b) -> -4 – Marco Nov 05 '19 at 13:37
3

It depends on the kind of signal you have (periodic?…), on whether both signals have the same amplitude, and on what precision you are looking for.

The correlation function mentioned by highBandWidth might indeed work for you. It is simple enough that you should give it a try.

Another, more precise option is the one I use for high-precision spectral line fitting: you model your "master" signal with a spline and fit the time-shifted signal with it (while possibly scaling the signal, if need be). This yields very precise time shifts. One advantage of this approach is that you do not have to study the correlation function. You can for instance create the spline easily with interpolate.UnivariateSpline() (from SciPy). SciPy returns a function, which is then easily fitted with optimize.leastsq().
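
A minimal sketch of this spline-fitting idea could look as follows. Everything in it is an assumption made for illustration: the Gaussian test pulse, the true shift and scale, the parameter names, and the choice of s=0 (interpolating spline) and ext=1 (spline returns zero outside the sampled range). Real data will typically need some smoothing and a reasonable initial guess, for instance from a cross-correlation peak.

```python
import numpy as np
from scipy import interpolate, optimize

# Synthetic "master" signal: a smooth pulse sampled on a regular grid.
t = np.linspace(0.0, 10.0, 201)
master = np.exp(-((t - 5.0) ** 2) / 2.0)

# Second signal: the same pulse, delayed by 0.37 and scaled by 1.5
# (values chosen arbitrarily for this example).
true_shift, true_scale = 0.37, 1.5
shifted = true_scale * np.exp(-((t - 5.0 - true_shift) ** 2) / 2.0)

# Spline model of the master signal; it can be evaluated at any time.
model = interpolate.UnivariateSpline(t, master, s=0, ext=1)

def residuals(params, t, y):
    """Difference between the data and the shifted, scaled spline model."""
    shift, scale = params
    return y - scale * model(t - shift)

# Fit the shift and the scale, starting from "no shift, unit scale".
(fit_shift, fit_scale), ier = optimize.leastsq(
    residuals, [0.0, 1.0], args=(t, shifted)
)
print(fit_shift, fit_scale)
```

The fitted shift is not restricted to multiples of the sample spacing, which is where the extra precision comes from.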

Eric O. Lebigot
  • Thanks! I just used optimize.leastsq: I had no idea this was tractable for timeshifts; much easier than a convolution approach. Do you know if there are any references for how optimize.leastsq works? I thought least-squares had to work with linear combinations of input basis functions. – Jason S Jun 04 '13 at 00:59
  • In the [documentation](http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html) one reads that "“leastsq” is a wrapper around MINPACK’s lmdif and lmder algorithms." You can find more information in MINPACK's code: http://www.netlib.org/minpack/lmdif.f and http://www.netlib.org/minpack/lmder.f. – Eric O. Lebigot Jun 04 '13 at 02:58
  • This is really interesting, but I'm struggling to code it. All the examples I've found for optimize.leastsq() seem to be about extracting model coefficients from real data fit to a function. I've got the spline function from the 'master signal', but how do I fit another dataset to that and extract the time offset? – vantom May 19 '23 at 21:10
1


(A very late answer.) To find the time shift between two signals, use the time-shift property of the Fourier transform, which allows shifts smaller than the sample spacing, and then compute the squared difference between the time-shifted waveform and the reference waveform. This can be useful when you have n shifted waveforms whose shifts are multiples of one another, as with n equally spaced receivers recording the same incoming wave. You can also correct for dispersion by replacing the static time shift with a function of frequency.

The code goes like this:

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft, ifft, fftshift, fftfreq
from scipy import signal

#  generating a test signal
dt = 0.01
t0 = 0.025
n = 512
freq = fftfreq(n, dt)

time = np.linspace(-n * dt / 2, n * dt / 2, n)
y = signal.gausspulse(time, fc=10, bw=0.3) + np.random.normal(0, 1, n) / 100
Y = fft(y)
# time-shift of 0.235; could be a dispersion curve, so y2 would be dispersive
Y2 = Y * np.exp(-1j * 2 * np.pi * freq * 0.235)  
y2 = ifft(Y2).real

# scan possible time-shifts
error = []
timeshifts = np.arange(-100, 100) * dt / 2  # could be dispersion curves instead
for ts in timeshifts:
    Y2_shifted = Y2 * np.exp(1j * 2 * np.pi * freq * ts)
    y2_shifted = ifft(Y2_shifted).real
    error.append(np.sum((y2_shifted - y) ** 2))

# show the results
ts_final = timeshifts[np.argmin(error)]
print(ts_final)

Y2_shifted = Y2 * np.exp(1j * 2 * np.pi * freq * ts_final)
y2_shifted = ifft(Y2_shifted).real

plt.subplot(221)
plt.plot(time, y, label="y")
plt.plot(time, y2, label="y2")
plt.xlabel("time")
plt.legend()

plt.subplot(223)
plt.plot(time, y, label="y")
plt.plot(time, y2_shifted, label="y2_shifted")
plt.xlabel("time")
plt.legend()

plt.subplot(122)
plt.plot(timeshifts, error, label="error")
plt.xlabel("timeshifts")
plt.legend()

plt.show()
