Match PCM data to previously entered data through mic or use as word recognition

Question

I am working on an application that recognizes speech from PCM data. Currently I just print the contents of the PCM buffer:

int N = AudioRecord.getMinBufferSize(8000,AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

recorder = new AudioRecord(AudioSource.MIC, 8000, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10);

track = new AudioTrack(AudioManager.STREAM_MUSIC, 8000,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10, AudioTrack.MODE_STREAM);

            recorder.startRecording();

            /*
             * Loops until something outside of this thread stops it.
             * Reads the data from the recorder and writes it to the audio track f
             */
            while(!stopped)
            { 
                //Log.i("Map", "Writing new data to buffer");
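                short[] buffer = buffers[ix++ % buffers.length];
                // read() returns the number of samples actually read, or a negative error code
                N = recorder.read(buffer,0,buffer.length);
                // Print only the samples that were just read, not stale data left in the buffer
                for(int i = 0; i < N; i++) {
                    System.out.println(String.valueOf(buffer[i]));
                }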
            }

I would like to either 1) match the incoming PCM data against previously recorded PCM data or 2) have it recognized as a word. For example, if I say 'hello' into the mic, the PCM data is turned into the word 'hello' and I can act on that word. Alternatively, if I record a 'hello' and a 'world' in two separate buffers and then say 'hello' again, the app can determine that I repeated 'hello' and not 'world'. Help, please.

score 0 · Answer 1 · edited May 23 '17 at 11:48

Android has built-in speech recognition. However, I don't believe it supports recorded PCM data; as far as I know, you must let it take voice input directly. See http://android-developers.blogspot.com/2010/03/speech-input-api-for-android.html to get started.

If you must have recorded data, you could use other services to do speech recognition. For an intro to some of the choices see https://stackoverflow.com/a/6351055/90236.
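If you go the external-service route, you will probably need to hand over the recording in a container format rather than as a raw short[] array. Which format a given service accepts is something to check in its documentation; the sketch below only assumes that a plain WAV file is acceptable, and wraps 16-bit mono PCM (as recorded in the question) in the standard 44-byte WAV header. The class and method names (PcmToWav, toWav) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class PcmToWav {

    /**
     * Wraps the first `count` samples of 16-bit mono PCM in a WAV container.
     * Assumes 0 <= count <= samples.length. Hypothetical helper, not an Android API.
     */
    public static byte[] toWav(short[] samples, int count, int sampleRate) {
        final int channels = 1;
        final int bitsPerSample = 16;
        final int dataLen = count * 2; // two bytes per 16-bit sample

        // WAV files are little-endian throughout
        ByteBuffer out = ByteBuffer.allocate(44 + dataLen).order(ByteOrder.LITTLE_ENDIAN);

        out.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        out.putInt(36 + dataLen);                                 // size of everything after this field
        out.put("WAVE".getBytes(StandardCharsets.US_ASCII));

        out.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        out.putInt(16);                                           // size of the fmt chunk
        out.putShort((short) 1);                                  // 1 = uncompressed PCM
        out.putShort((short) channels);
        out.putInt(sampleRate);
        out.putInt(sampleRate * channels * bitsPerSample / 8);    // bytes per second
        out.putShort((short) (channels * bitsPerSample / 8));     // bytes per sample frame
        out.putShort((short) bitsPerSample);

        out.put("data".getBytes(StandardCharsets.US_ASCII));
        out.putInt(dataLen);
        for (int i = 0; i < count; i++) {
            out.putShort(samples[i]);
        }
        return out.array();
    }
}
```

With the loop from the question, this would be called as PcmToWav.toWav(buffer, N, 8000) after a successful read, and the resulting bytes written to a file or sent to the service.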

If you are just experimenting (and not building a production app), you could also try the Google speech recognition service that Chrome uses. You'd have to convert from PCM to FLAC first. See Google's voice search speech recognition service.

If you want to compare PCM buffers without doing recognition, signal processing is a deep and interesting field. Sorry, I'm too rusty to give any advice in that realm.
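As a rough starting point for the "did I say 'hello' or 'world' again" case, here is a minimal sketch of one classic approach: template matching with dynamic time warping (DTW). This is a technique I am swapping in for illustration, not something the built-in recognizer does. It reduces each buffer to per-frame features (log energy and zero-crossing rate, chosen for simplicity) and picks the stored template with the smallest DTW distance. Everything here is an assumption: the class name, the 20 ms frame size, and the feature weight are arbitrary, and a real system would typically use richer features such as MFCCs plus silence trimming.

```java
public class PcmTemplateMatcher {

    static final int FRAME = 160;          // 20 ms at 8000 Hz (assumed sample rate)
    static final double ZCR_WEIGHT = 10.0; // arbitrary scaling so both features matter

    /** Per-frame features: [log energy, weighted zero-crossing rate]. */
    static double[][] features(short[] pcm) {
        int frames = pcm.length / FRAME;
        double[][] f = new double[frames][2];
        for (int k = 0; k < frames; k++) {
            double energy = 0;
            int crossings = 0;
            for (int i = 0; i < FRAME; i++) {
                int s = pcm[k * FRAME + i];
                energy += (double) s * s;
                // Count sign changes between consecutive samples within the frame
                if (i > 0 && (s >= 0) != (pcm[k * FRAME + i - 1] >= 0)) {
                    crossings++;
                }
            }
            f[k][0] = Math.log1p(energy / FRAME);
            f[k][1] = ZCR_WEIGHT * crossings / FRAME;
        }
        return f;
    }

    /** DTW distance between two feature sequences, normalized by their total length. */
    static double dtw(double[][] a, double[][] b) {
        int n = a.length, m = b.length;
        if (n == 0 || m == 0) {
            return Double.POSITIVE_INFINITY;
        }
        double[][] d = new double[n + 1][m + 1];
        for (double[] row : d) {
            java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        d[0][0] = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double de = a[i - 1][0] - b[j - 1][0];
                double dz = a[i - 1][1] - b[j - 1][1];
                double cost = Math.sqrt(de * de + dz * dz);
                // Allow a match, or stretching either sequence in time
                d[i][j] = cost + Math.min(d[i - 1][j - 1], Math.min(d[i - 1][j], d[i][j - 1]));
            }
        }
        return d[n][m] / (n + m);
    }

    /** Returns the index of the template closest to the input, or -1 if none is usable. */
    static int closest(short[] input, short[][] templates) {
        double[][] in = features(input);
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int t = 0; t < templates.length; t++) {
            double dist = dtw(in, features(templates[t]));
            if (dist < bestDist) {
                bestDist = dist;
                best = t;
            }
        }
        return best;
    }
}
```

You would record 'hello' and 'world' once into two short[] arrays, then call closest(newRecording, new short[][] {hello, world}). In practice you would also want a threshold on the distance so that an unrelated sound is rejected rather than assigned to the nearest template.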
