
On one side, I'm recording an audio stream on my Android smartphone using AudioRecord.read(). For the recording I'm using the following specs:

  • Sample rate: 44100 Hz
  • Mono channel
  • PCM 16-bit
  • Size of the array I pass to AudioRecord.read(): 100 (short array)
    • Using this small size lets me read roughly every 0.5 ms (mean value), so I can use the timestamp of each read later for the multilateration (at least I think so :-) ). Maybe this becomes obsolete if I can use cross-correlation to determine the TDoA (see below)?
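As a rough check on the timing, here is a small sketch that converts a sample position into a time offset. It assumes the recorder delivers samples continuously without gaps; the helper names `block_duration_ms` and `sample_time_ms` are made up for illustration.

```python
SAMPLE_RATE = 44100  # Hz, from the recording specs above
BLOCK_SIZE = 100     # samples per AudioRecord.read() call


def block_duration_ms(block_size=BLOCK_SIZE, sample_rate=SAMPLE_RATE):
    """Length of audio covered by one block, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def sample_time_ms(block_index, offset_in_block,
                   block_size=BLOCK_SIZE, sample_rate=SAMPLE_RATE):
    """Time of one sample relative to the first recorded sample.

    Assumes (hypothetically) that no samples were dropped between reads.
    """
    absolute_index = block_index * block_size + offset_in_block
    return 1000.0 * absolute_index / sample_rate


print(block_duration_ms())     # audio covered by one 100-sample block
print(sample_time_ms(10, 50))  # sample 50 of the 11th block
```

Under that assumption, the position of a sample inside the stream already gives a time resolution of one sample period (1/44100 s), independent of when the read call returned.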

On the other side, I have three speakers emitting different sounds using the Web Audio API, with the following specs:

  • freq1: 17500 Hz
  • freq2: 18500 Hz
  • freq3: 19500 Hz
  • signal length: 200 ms plus a 5 ms fade-in and a 5 ms fade-out on the gain node, so 210 ms in total

My goal is to determine the time difference of arrival (TDoA) between the emitted sounds. So in each iteration I read 100 samples from my AudioRecord buffer and then try to determine the time difference (if I found one of my sounds). So far I've used a simple frequency filter (using FFT) to determine the TDoA, but this is really inaccurate in the real world.

I've since found out that cross-correlation should give a better TDoA estimate (http://paulbourke.net/miscellaneous/correlate/ and some threads here on SO). Now my problem: at the moment I think I have to correlate the recorded signal (my short array) with a generated signal for each of my three sounds above. But I'm struggling to generate this signal. The code found at (http://repository.tudelft.nl/view/ir/uuid%3Ab6c16565-cac8-448d-a460-224617a35ae1/ section B1.1. genTone()) does not really solve my problem, because it generates an array far bigger than my recorded samples. And as far as I know, cross-correlation needs two arrays of the same size to work. So how can I generate a sample array?
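To make the question concrete, here is a minimal sketch of how such a reference array could be generated. It assumes the gain node fades are linear ramps and that the reference is built at the recorder's sample rate; `gen_tone` here is a hypothetical helper, not the genTone() from the linked thesis.

```python
import math

SAMPLE_RATE = 44100  # Hz, same rate as the recording


def gen_tone(freq_hz, tone_ms=200.0, fade_ms=5.0,
             sample_rate=SAMPLE_RATE, amplitude=32767):
    """Sine burst as 16-bit values: fade-in + full-level tone + fade-out.

    Assumption: the fades are linear ramps of the gain.
    """
    fade_n = int(sample_rate * fade_ms / 1000.0)
    total_n = int(round(sample_rate * (tone_ms + 2 * fade_ms) / 1000.0))
    samples = []
    for n in range(total_n):
        if n < fade_n:
            gain = n / fade_n                  # ramp up from 0
        elif n >= total_n - fade_n:
            gain = (total_n - 1 - n) / fade_n  # ramp down to 0
        else:
            gain = 1.0
        value = amplitude * gain * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        samples.append(int(round(value)))
    return samples


# One reference per speaker frequency from the list above.
references = {f: gen_tone(f) for f in (17500, 18500, 19500)}
print(len(references[17500]))  # number of samples in one 210 ms reference
```

A 210 ms reference is far longer than one 100-sample read, so with this approach the reads would have to be appended to a longer buffer before correlating.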

Another question: is my approach to determining the TDoA correct so far?

Philipp

1 Answer


Here are some lessons I've learned over the past few days:

  • I can use either cross-correlation (xcorr) or a frequency recognition technique to determine the TDoA. The latter is far less precise, so I focus on xcorr.
  • I can get the TDoA by applying xcorr to my recorded signal and two reference signals. E.g. my recording has a length of 1000 samples. With xcorr I detect sound A at sample 500 and sound B at sample 600. So I know they have a time difference of 100 samples (which can be converted to seconds using the sample rate).
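The example from the list above can be sketched like this. It is a noise-free toy setup with assumptions of mine: two short sine bursts (50 samples each, at 17500 Hz and 18500 Hz) placed at samples 500 and 600 of a silent 1000-sample recording, and a plain time-domain correlation; `tone` and `best_lag` are hypothetical helper names.

```python
import math

SAMPLE_RATE = 44100  # Hz


def tone(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Plain sine burst, used as reference and as the emitted sound."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def best_lag(record, reference):
    """Sample index in record where the reference correlates best."""
    best_index, best_score = 0, float("-inf")
    for lag in range(len(record) - len(reference) + 1):
        score = sum(record[lag + k] * reference[k]
                    for k in range(len(reference)))
        if score > best_score:
            best_index, best_score = lag, score
    return best_index


ref_a = tone(17500, 50)
ref_b = tone(18500, 50)

# Toy recording: silence with sound A at sample 500 and sound B at sample 600.
record = [0.0] * 1000
for k in range(50):
    record[500 + k] += ref_a[k]
    record[600 + k] += ref_b[k]

pos_a = best_lag(record, ref_a)
pos_b = best_lag(record, ref_b)
tdoa_samples = pos_b - pos_a
tdoa_ms = 1000.0 * tdoa_samples / SAMPLE_RATE
print(pos_a, pos_b, tdoa_samples, tdoa_ms)
```

With real recordings (noise, echoes, overlapping sounds) the peak will be much less clean than in this toy case, so treat it only as an illustration of the idea.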

Therefore I generate a linear chirp (chirps are better than simple sine waves, see the literature) using this code found on SO. As an easy example, and to check whether my experiment works, I save my recording as well as my generated chirp sounds as .wav files (there are plenty of code examples on how to do this). Then I use MATLAB as an easy way to calculate the xcorr: see here
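For reference, a linear chirp can be sketched in a few lines. This is not the SO code mentioned above, just a minimal version; the 17000 to 18000 Hz band and the 200 ms duration are example values I picked, and `linear_chirp` is a hypothetical helper name.

```python
import math


def linear_chirp(f_start, f_end, duration_s, sample_rate=44100):
    """Sine sweep whose frequency rises linearly from f_start to f_end.

    Phase is the integral of the instantaneous frequency:
    phase(t) = 2*pi * (f_start*t + 0.5*k*t^2), with k = (f_end - f_start) / duration.
    """
    n_samples = int(round(sample_rate * duration_s))
    k = (f_end - f_start) / duration_s  # sweep rate in Hz per second
    chirp = []
    for n in range(n_samples):
        t = n / sample_rate
        chirp.append(math.sin(2 * math.pi * (f_start * t + 0.5 * k * t * t)))
    return chirp


# Example values (assumptions): one chirp per speaker in its own band.
chirp_a = linear_chirp(17000, 18000, 0.2)
print(len(chirp_a))
```

Multiplying the values by 32767 and rounding would give 16-bit PCM samples for writing to a .wav file.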

Another point: "do the inputs of xcorr have to be the same size?" I'm not quite sure about this part, but I think it has to be done. We can achieve this by zero-padding the two signals to the same length (preferably a power of two, so we can use the efficient radix-2 implementation of the FFT) and then using the FFT to calculate the xcorr (see another link from SO).
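Here is a minimal sketch of the zero-padding plus FFT idea, written in plain Python with a small recursive radix-2 FFT so it has no dependencies. The function names are hypothetical, and I pad to the next power of two that is at least len(record) + len(reference) - 1, so the circular correlation does not wrap around.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. Inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def xcorr_fft(record, reference):
    """corr[lag] = sum over n of record[n + lag] * reference[n], lag >= 0."""
    size = next_pow2(len(record) + len(reference) - 1)
    a = list(record) + [0.0] * (size - len(record))        # zero-pad both
    b = list(reference) + [0.0] * (size - len(reference))  # to the same size
    fa, fb = fft(a), fft(b)
    product = [x * y.conjugate() for x, y in zip(fa, fb)]
    corr = fft(product, inverse=True)
    return [c.real / size for c in corr[:len(record)]]


# Toy check: the pattern [1, 2, 3] is hidden at index 8 of the recording.
record = [0.0] * 8 + [1.0, 2.0, 3.0] + [0.0] * 5
reference = [1.0, 2.0, 3.0]
corr = xcorr_fft(record, reference)
peak_lag = max(range(len(corr)), key=lambda i: corr[i])
print(peak_lag)
```

So the two inputs do not need the same original length; the padding only makes them the same size for the FFT.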

I hope this is correct so far and answers some questions other people have :-)

Philipp