
I really hope someone can help me. I am currently working with the Android "AudioRecord" and everything works fine. The next step is to work with the bytes I get back from the buffer when I call read().

Having been a web developer for quite some time, I lack some basics, mostly about the bytes stored in that buffer. I would really like to understand "what" the bytes are that I get back from the method. It seems I need some fundamentals, mostly on how to analyse the data in there (I want to find out whether there was any sound and how loud it was, and I don't just want code; I really want to understand what happens there).

Would someone be so kind as to give me links to articles/blogs/books I could read to gain some more knowledge about this kind of audio analysis?

Chathuranga Chandrasekara
    visit http://www.tutorialspoint.com/android/android_audio_capture.htm https://github.com/Uncodin/Android-AudioRecorder https://github.com/ionull/android-realtime-audio-recorder – Jitesh Upadhyay Feb 24 '14 at 11:34

1 Answer


In my experience the Android AudioRecord doesn't work well with 8-bit byte buffers on many devices. So set your recorder up to record 16-bit audio and use the read() overload that takes a short array.
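As a sketch of that setup (this is Android-only code, so it won't run on a desktop JVM; it also assumes the RECORD_AUDIO permission is granted, and the 8000 Hz mono configuration is just an example):

```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class Recorder {
    static final int SAMPLE_RATE = 8000; // example rate; pick one your device supports

    void recordOnce() {
        // Ask the platform for the smallest workable buffer for this format.
        int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

        AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minBuf * 2);

        short[] buffer = new short[minBuf];
        recorder.startRecording();
        int read = recorder.read(buffer, 0, buffer.length); // fills buffer with samples
        // ... analyse buffer[0] .. buffer[read - 1] here ...
        recorder.stop();
        recorder.release();
    }
}
```

The important part is the short[] overload of read(): with ENCODING_PCM_16BIT each element of the buffer is one complete sample, so you never have to reassemble sample values from byte pairs yourself.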

But from there it's quite complicated to explain what the actual values represent. The audio you get is Pulse Code Modulated (PCM, which you have probably heard of). This means that you have a fixed sample rate (say 8000 Hz), and every 1/8000th of a second you receive an amplitude. Over time these amplitudes form the waveform that you are probably familiar with. The values in the short array are these amplitudes.
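To make that concrete, here is a small self-contained sketch (not from the original answer, and assuming an 8000 Hz sample rate) that walks a short[] buffer and prints the point in time each amplitude belongs to:

```java
public class PcmBasics {
    static final int SAMPLE_RATE = 8000; // assumed: 8000 samples per second

    /** Time in seconds at which sample index i was measured. */
    static double sampleTime(int i) {
        return (double) i / SAMPLE_RATE;
    }

    public static void main(String[] args) {
        // A hand-made buffer standing in for what read() would fill.
        short[] buffer = { 0, 12000, 32767, 12000, 0, -12000, -32768, -12000 };
        for (int i = 0; i < buffer.length; i++) {
            System.out.printf("t=%.6fs amplitude=%d%n", sampleTime(i), buffer[i]);
        }
    }
}
```

So the buffer is nothing more than the waveform itself, chopped into 8000 evenly spaced measurements per second.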

If you are familiar with how a speaker works, you will know that a magnet pushes a diaphragm forwards and backwards. The value you get represents how far forward or backward the diaphragm is moved (the instantaneous amplitude). So in the short array, 32767 represents fully forward and -32768 represents fully backward. 0 is directly in between, and it is the position the speaker sits in when it is turned off.
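Since the question asks how to tell whether there was any sound and how loud it was, here is one common way to do that with those short values (my own sketch, not from the original answer): normalise each sample by the full-scale value and compute the peak and the RMS level in dBFS.

```java
public class Loudness {
    /** Peak absolute amplitude of a 16-bit PCM buffer, normalised to [0, 1]. */
    static double peak(short[] samples) {
        int max = 0;
        for (short s : samples) {
            max = Math.max(max, Math.abs((int) s)); // cast first: abs(-32768) overflows short
        }
        return max / 32768.0;
    }

    /** Root-mean-square level in decibels relative to full scale (dBFS). */
    static double rmsDb(short[] samples) {
        double sumSquares = 0;
        for (short s : samples) {
            double x = s / 32768.0;
            sumSquares += x * x;
        }
        double rms = Math.sqrt(sumSquares / samples.length);
        return 20 * Math.log10(rms); // closer to 0 dBFS means louder
    }

    public static void main(String[] args) {
        short[] buffer = new short[160];              // 20 ms at 8000 Hz
        java.util.Arrays.fill(buffer, (short) 16384); // half of full scale
        System.out.println("peak  = " + peak(buffer));
        System.out.println("level = " + rmsDb(buffer) + " dBFS");
    }
}
```

A buffer of near-zero values (quiet room noise) gives a peak near 0 and a very negative dBFS level; speech or music pushes both towards full scale.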

To produce sound in the speaker example you need to move the diaphragm forwards and backwards. To create a 50 Hz signal the diaphragm needs to move forward and backward 50 times a second. To create a 1000 Hz signal it needs to move forward and backward 1000 times a second, and so on. These signals can be added together to create more complex signals.
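That "adding signals together" can be shown directly in code. This sketch (my addition, again assuming 8000 Hz) generates the sample values for a sine tone at a given frequency and mixes two tones by summing their samples, clipping at the 16-bit limits:

```java
public class ToneSynth {
    static final int SAMPLE_RATE = 8000; // assumed sample rate

    /** One buffer of a sine tone at the given frequency; amplitude in [0, 1]. */
    static short[] sine(double freqHz, double amplitude, int numSamples) {
        short[] out = new short[numSamples];
        for (int i = 0; i < numSamples; i++) {
            double t = (double) i / SAMPLE_RATE;
            out[i] = (short) Math.round(amplitude * 32767 * Math.sin(2 * Math.PI * freqHz * t));
        }
        return out;
    }

    /** Mix two equal-length signals by adding samples, clipping to the short range. */
    static short[] mix(short[] a, short[] b) {
        short[] out = new short[a.length];
        for (int i = 0; i < a.length; i++) {
            int sum = a[i] + b[i]; // may exceed the short range, hence the clamp
            out[i] = (short) Math.max(-32768, Math.min(32767, sum));
        }
        return out;
    }

    public static void main(String[] args) {
        short[] lowTone  = sine(50, 0.4, 8000);   // one second of 50 Hz
        short[] highTone = sine(1000, 0.4, 8000); // one second of 1000 Hz
        short[] complex  = mix(lowTone, highTone);
        System.out.println("mixed buffer length: " + complex.length);
    }
}
```

Feeding a buffer like this to a playback API would move the diaphragm exactly as described: 50 full cycles per second for the low tone, 1000 for the high one, and both at once for the mix.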

To add more complexity: with short values you have 65536 discrete positions the speaker diaphragm can be moved to. This is a fundamental difference between analogue and digital audio. A truly analogue recording has an infinite number of possible diaphragm positions, whereas digital audio is "quantised".
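Quantisation is easy to demonstrate (my sketch, using a simple symmetric rounding scheme rather than any particular codec's): snap an "analogue" value in [-1, 1] to the nearest 16-bit level and look at the small error that snapping introduces.

```java
public class Quantise {
    /** Round an "analogue" value in [-1, 1] to the nearest 16-bit sample value. */
    static short quantise(double analogue) {
        return (short) Math.max(-32768, Math.min(32767, Math.round(analogue * 32767)));
    }

    /** The information lost by quantising: true value minus the reconstructed one. */
    static double quantisationError(double analogue) {
        return analogue - quantise(analogue) / 32767.0;
    }

    public static void main(String[] args) {
        double analogue = 0.123456789;      // one of infinitely many possible values...
        short digital = quantise(analogue); // ...snapped to one of the discrete levels
        System.out.println("digital sample:     " + digital);
        System.out.println("quantisation error: " + quantisationError(analogue));
    }
}
```

That error is the "quantisation noise" the Wikipedia article below goes into; with 16 bits it is tiny, which is why CD-quality audio sounds clean.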

That's a very basic explanation, as anything more complex really is out of scope for a Stack Overflow answer. There is loads more you can read on Wikipedia and other sources. Here are a couple of Wikipedia links to help you get started:

http://en.wikipedia.org/wiki/PCM
http://en.wikipedia.org/wiki/Quantisation_(signal_processing)

Goz