How do I interpret audio encoded binary data?

Question

I have built a little program that encodes binary data into a sound. For example the following binary input:

00101101

will produce a 'sound' like this:

################..S.SS.S################

where each character represents a constant unit of time. # stands for an 880 Hz sine wave, which marks the start and end of the transmission; . stands for silence, representing the zeroes; and S stands for a 440 Hz sine wave, representing the ones. Obviously, the part in the middle is much longer in practice.
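A minimal sketch of that encoding scheme, for illustration only. The sample rate, the unit duration, the marker length of 16 units and the names `tone` and `encode` are all assumptions made for this example, not values taken from the actual program. The phase also restarts at every unit, which a real encoder might want to avoid.

```ruby
# Samples for `units` time units of a sine wave at `freq` Hz.
# A frequency of 0 yields silence. Sample rate and unit length are assumed values.
def tone(freq, units = 1, sample_rate: 44_100, unit_seconds: 0.01)
  n = (sample_rate * unit_seconds * units).round
  Array.new(n) do |i|
    freq.zero? ? 0.0 : Math.sin(2 * Math::PI * freq * i / sample_rate)
  end
end

# Encodes a string of '0'/'1' characters:
# 880 Hz marker, then silence for each 0 and 440 Hz for each 1, then the marker again.
def encode(bits, marker_units: 16)
  marker = tone(880, marker_units)
  body   = bits.each_char.flat_map { |b| b == '1' ? tone(440) : tone(0) }
  marker + body + marker
end

samples = encode('00101101')
puts samples.length
```

The resulting array of floats in the range -1.0..1.0 would still have to be written to a sound file or audio device, which is outside the scope of this sketch.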

The essence of my question is: How can I invert this operation?

The sound file is transmitted to the recipient by simply playing it back and recording it. In other words, I need to decode the recording, not the original sound file, which would be easy.

Obviously I have to analyze the recorded data with respect to frequency. But how? I have read a bit about Fourier Transform but I am quite lost here.

I am not sure where to start but I know that this is not trivial and probably requires quite some knowledge about signal processing. Can somebody point me in the right direction?

BTW: I am doing this in Ruby (I know, it's slow - it's just a proof of concept) but the problem itself is not programming language specific so any answers are very welcome.

You're describing [Audio Frequency Shift Keying.](http://en.wikipedia.org/wiki/Frequency-shift_keying#Audio_FSK) — Robert Harvey, May 04 '12 at 19:46
http://stackoverflow.com/questions/3714321/open-source-fsk-decoder-library — Robert Harvey, May 04 '12 at 19:49
ah great, why not reinvent the wheel? :) thanks for the hints! — Patrick Oscity, May 04 '12 at 19:59
Ruby is surprisingly quick for a lot of tasks. It won't keep up with C++ or C or assembly, but your development time should be a lot faster so the job is running sooner. And, depending on your chain for data movement, it might be plenty fast. It surprises us a lot. — the Tin Man, May 04 '12 at 20:20
I doubt that using a frequency and its double is a good idea (because of harmonics). But for a proof of concept, it can be ok. However, it sounds a little crazy to program a FSK decoder if you don't know about Fourier transforms... — leonbloy, May 07 '12 at 02:14

VMMF · Accepted Answer · 2015-08-18T17:21:25.150

2

You are trying to demodulate an FSK-modulated signal. I would recommend implementing a bank of correlators, one tuned to each frequency. If speed is a concern, this is a lot faster than an FFT, because it only evaluates the few frequencies you actually use instead of the whole spectrum.
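A rough sketch of what such a correlation bank could look like in Ruby, using the 440 Hz and 880 Hz tones from the question. The function names, the 44.1 kHz sample rate and the threshold are assumptions for illustration; the threshold in particular depends on the recording level and would need tuning. Each window is correlated against both a cosine and a sine reference, so the unknown phase of the recorded tone does not matter.

```ruby
# Normalized energy of `window` at `freq` Hz. Correlating against both a
# cosine and a sine reference makes the result independent of the tone's phase.
def tone_energy(window, freq, sample_rate)
  i_sum = 0.0
  q_sum = 0.0
  window.each_with_index do |x, n|
    angle = 2 * Math::PI * freq * n / sample_rate
    i_sum += x * Math.cos(angle)
    q_sum += x * Math.sin(angle)
  end
  (i_sum**2 + q_sum**2) / window.length**2
end

# Classifies one symbol-length window as the tone with the most energy,
# or :silence if no tone exceeds `threshold` (an assumed value to tune).
def classify(window, freqs: [440, 880], sample_rate: 44_100, threshold: 0.01)
  freq, energy = freqs.map { |f| [f, tone_energy(window, f, sample_rate)] }
                      .max_by(&:last)
  energy > threshold ? freq : :silence
end
```

To decode a recording you would call `classify` on consecutive windows of one symbol length each, after using the 880 Hz run to find where the symbols start.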

edited Aug 18 '15 at 17:21

answered Aug 18 '15 at 12:43

Thanks for pointing that out. This is actually the approach I ended up using. I also picked 1200 and 2200 Hz, which are the frequencies used by Packet Radio. – Patrick Oscity Aug 18 '15 at 16:06
You're welcome! Are you actually transmitting data through sound travelling through the air (from speaker to microphone)? I have implemented an 8-FSK scheme that can transmit 200 bps at a distance of 1 m. I'm using one speaker from a Creative T6300 (5.1 system) and several smartphones. – VMMF Aug 18 '15 at 17:47
Yes, this was part of my bachelor's thesis. I built an iPhone app to transmit files over mic/speaker. Now I recall that I increased the transmission speed even more, to 2400 baud, so I needed to switch to 2400/4400 Hz. Here's a video of the finished app if you're interested: https://vimeo.com/48487024 – Patrick Oscity Aug 18 '15 at 20:54
I don't have any experience with iPhones, but on Android the maximum audio sampling frequency I've seen is 48 kHz. What is the iPhone's? At 48 kHz, transmitting 2400 bps leaves only 20 samples per FSK symbol. I couldn't see the video (do you have it on Google Play or YouTube?) but I saw some pictures at patrickoscity.de/projects/rx-tx. Is it only iPhone to iPhone, or also PC to iPhone? How far apart can the devices be? What kind of synchronization and error correction scheme did you implement? Why 2400 and 4400 Hz carriers, which are still audible, rather than 19 and 21 kHz? Thanks – VMMF Aug 19 '15 at 13:48

score 1 · Answer 2 · answered May 05 '12 at 01:59

If you know the frequencies and the modulation rate, you can try using two sliding Goertzel filters, one per tone, for FSK demodulation.
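A minimal sketch of the idea in Ruby, under several assumptions: the 440 Hz and 880 Hz tones from the question, a 44.1 kHz sample rate, a known symbol length, and symbol boundaries that are already aligned. For simplicity it re-runs a block Goertzel filter on consecutive windows rather than doing a true sample-by-sample sliding update, and the names and threshold are made up for this example.

```ruby
# Goertzel power of `window` at `freq` Hz, evaluated at the exact target
# frequency rather than rounded to the nearest DFT bin.
def goertzel_power(window, freq, sample_rate)
  coeff = 2 * Math.cos(2 * Math::PI * freq / sample_rate)
  s1 = 0.0
  s2 = 0.0
  window.each do |x|
    s0 = x + coeff * s1 - s2
    s2 = s1
    s1 = s0
  end
  s1**2 + s2**2 - coeff * s1 * s2
end

# Runs two Goertzel filters (440 Hz and 880 Hz) over consecutive windows and
# returns a string in the question's notation: '#' = 880 Hz, 'S' = 440 Hz,
# '.' = silence. `threshold` is an assumed value that depends on signal level.
def decode_symbols(samples, symbol_len, sample_rate: 44_100, threshold: 0.05)
  samples.each_slice(symbol_len).map do |w|
    p440 = goertzel_power(w, 440, sample_rate) / w.length**2
    p880 = goertzel_power(w, 880, sample_rate) / w.length**2
    if [p440, p880].max < threshold
      '.'
    elsif p880 > p440
      '#'
    else
      'S'
    end
  end.join
end
```

With a real recording the window boundaries will not line up with the symbols by themselves, so you would first search for the start of the 880 Hz run, for example by sliding the window in small steps until the 880 Hz power jumps.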

hotpaw2
