14

I have two audio recordings of the same signal made by 2 different microphones (for example, in WAV format), but one of them is recorded with a delay of, say, several seconds.

It's easy to identify such a delay visually when viewing these signals in some kind of waveform viewer, i.e. by spotting the first visible peak in each signal and checking that the peaks have the same shape:


(source: greycat.ru)

But how do I do it programmatically, i.e. find out what this delay (t) is? The two digitized signals are slightly different (because the microphones are different, were at different positions, had different ADC setups, etc).

I've dug around a bit and found out that this problem is usually called "time-delay estimation" and that there are myriad approaches to it - for example, one of them.

But are there any simple, ready-made solutions, such as a command-line utility, a library, or a straightforward algorithm?

Conclusion: I found no simple implementation, so I wrote a simple command-line utility myself, available at https://bitbucket.org/GreyCat/calc-sound-delay (GPLv3-licensed). It implements a very simple search-for-maximum algorithm described on Wikipedia.

Glorfindel
GreyCat

3 Answers

14

The technique you're looking for is called cross-correlation. It's a very simple, if somewhat compute-intensive, technique that can be used to solve various problems, including measuring the time difference (aka lag) between two similar signals (the signals do not need to be identical).

If you have a reasonable idea of your lag value (or at least the range of lag values that are expected) then you can reduce the total amount of computation considerably. Ditto if you can put a definite limit on how much accuracy you need.
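As a rough illustration of restricting the search to an expected lag range, here is a minimal pure-Python sketch. The function name `estimate_lag` and the parameter `max_lag` are made up for this example; it assumes both recordings are mono sample sequences at the same sample rate.

```python
def estimate_lag(a, b, max_lag):
    """Return the lag (in samples) within [-max_lag, max_lag] that maximizes
    sum(a[n + lag] * b[n]). A positive result means the shared content
    appears later in `a` than in `b`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Range of indices n for which both b[n] and a[n + lag] exist.
        start = max(0, -lag)
        stop = min(len(b), len(a) - lag)
        score = sum(a[n + lag] * b[n] for n in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: `a` is `b` delayed by 3 samples.
b = [0.0, 1.0, 4.0, -2.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
a = [0.0, 0.0, 0.0] + b[:-3]
print(estimate_lag(a, b, max_lag=5))
```

The cost grows with `max_lag` times the signal length, which is why a tight bound on the expected lag helps. Dividing the returned lag by the sample rate gives the delay in seconds.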

Paul R
  • Yes, cross-correlation, exactly. Good for mentioning the computation can be reduced if a good starting point can be guesstimated. – Dan Feb 11 '11 at 13:41
  • I dug around and found no simple implementations of this algorithm, so I've made one myself and published it: https://bitbucket.org/GreyCat/calc-sound-delay – GreyCat Feb 14 '11 at 13:10
  • Cross-correlation is a lot faster if you use an FFT. https://gist.github.com/376572 – endolith Apr 09 '12 at 21:50
  • Intuition for convolution and cross-correlation: https://www.youtube.com/watch?v=MQm6ZP1F6ms – XMB5 Jun 29 '21 at 02:31
  • Is it possible to use this technique for acoustic signals? – Aniiya0978 Sep 12 '21 at 15:58
  • @Aniiya0978: yes, audio signals are a very common use case. – Paul R Sep 13 '21 at 20:16
2

Having had the same problem and having failed to find a tool that syncs the start of video/audio recordings automatically, I decided to make syncstart (github).

It is a command line tool. The basic code behind it is this:

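import numpy as np
from scipy import fft
from scipy.io import wavfile

# Placeholder file names (not in the original snippet); both files are
# assumed to be mono WAVs with the same sample rate.
in1, in2 = "first.wav", "second.wav"
r1, s1 = wavfile.read(in1)
r2, s2 = wavfile.read(in2)
assert r1 == r2, "syncstart normalizes using ffmpeg"
fs = r1
ls1 = len(s1)
ls2 = len(s2)
# Zero-pad both signals to a power of two larger than their combined
# length, so the circular FFT correlation does not wrap around.
padsize = ls1 + ls2 + 1
padsize = 2**(int(np.log(padsize) / np.log(2)) + 1)
s1pad = np.zeros(padsize)
s1pad[:ls1] = s1
s2pad = np.zeros(padsize)
s2pad[:ls2] = s2
# Cross-correlation via the FFT: corr[k] = sum over n of s1[n + k] * s2[n].
corr = fft.ifft(fft.fft(s1pad) * np.conj(fft.fft(s2pad)))
ca = np.absolute(corr)
xmax = np.argmax(ca)
# `file` is the recording in which the shared signal appears later;
# `offset` is that delay in seconds.
if xmax > padsize // 2:
    # A peak in the upper half is a wrapped-around negative lag: in2 is late.
    file, offset = in2, (padsize - xmax) / fs
else:
    file, offset = in1, xmax / fs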
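The sign convention of an FFT-based cross-correlation is easy to get backwards, so it can help to check it on synthetic data with a known delay. The following is a sketch only, not the syncstart code: the helper name `fft_lag` is invented here, it uses NumPy's real FFT instead of `scipy.fft`, and it takes the maximum of the correlation itself rather than of its absolute value (so, unlike the snippet above, it would not match a polarity-inverted copy).

```python
import numpy as np


def fft_lag(s1, s2):
    """Lag in samples of s1 relative to s2 (positive: s1 is delayed)."""
    n = len(s1) + len(s2) - 1         # length of the full linear correlation
    size = 1 << (n - 1).bit_length()  # next power of two >= n
    f1 = np.fft.rfft(s1, size)        # rfft zero-pads its input to `size`
    f2 = np.fft.rfft(s2, size)
    corr = np.fft.irfft(f1 * np.conj(f2), size)
    k = int(np.argmax(corr))
    # Indices past the midpoint are wrapped-around negative lags.
    return k - size if k > size // 2 else k


rng = np.random.default_rng(0)
ref = rng.standard_normal(1000)
delayed = np.concatenate([np.zeros(40), ref])  # same signal, 40 samples late
print(fft_lag(delayed, ref), fft_lag(ref, delayed))
```

Swapping the arguments flips the sign of the result, which corresponds to the two branches of the `if` in the snippet above.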
Roland Puntaier
  • Roland I'm currently testing your code and there's a mistake there. In the last if statement you're calling the variables "in1 or in2" but they are not defined anywhere – little_mice Mar 26 '21 at 11:35
  • That would be the file names. See the [github version](https://github.com/rpuntaie/syncstart/blob/main/syncstart.py#LC194). – Roland Puntaier Mar 27 '21 at 20:03
1

A very straightforward thing to do is just to check whether the peaks exceed some threshold; the time between the high peak on line A and the high peak on line B is probably your delay. Try tinkering a bit with the thresholds, and if the graphs are usually as clear as the picture you posted, you should be fine.
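A minimal sketch of this threshold idea, assuming both signals are sequences of samples on a comparable amplitude scale (the function names and the sample values below are invented for illustration):

```python
def first_crossing(samples, threshold):
    """Index of the first sample whose magnitude exceeds threshold, or None."""
    for i, x in enumerate(samples):
        if abs(x) > threshold:
            return i
    return None


def threshold_delay(a, b, threshold, sample_rate):
    """Delay of b relative to a in seconds, from each signal's first crossing."""
    ia = first_crossing(a, threshold)
    ib = first_crossing(b, threshold)
    if ia is None or ib is None:
        raise ValueError("threshold never exceeded in one of the signals")
    return (ib - ia) / sample_rate


a = [0.0, 0.01, 0.9, 0.4, 0.0, 0.0, 0.0, 0.0]
b = [0.0, 0.0, 0.02, 0.0, 0.01, 0.8, 0.3, 0.0]
print(threshold_delay(a, b, threshold=0.5, sample_rate=2))
```

Because the two microphones may record at different levels, it may be necessary to scale each signal (for example by its own maximum) before applying a shared threshold.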

Roy T.