I am working on an Android application using RecognizerIntent.ACTION_RECOGNIZE_SPEECH. My problem is that I don't know how to create the buffer that will capture the voice the user inputs. I have read a lot on Stack Overflow, but I don't understand how to include the buffer and the recognition service callback in my code, or how to play back the contents that were saved into the buffer.

This is my code:

    public class Voice extends Activity implements OnClickListener {

        static final int check = 0;
        static final String TAG = "Voice"; // was null, which makes the Log calls hard to filter

        byte[] sig = new byte[500000];
        int sigPos = 0;
        ListView lv;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.voice);

            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
                    "com.domain.app");

            SpeechRecognizer recognizer = SpeechRecognizer
                    .createSpeechRecognizer(this.getApplicationContext());

            RecognitionListener listener = new RecognitionListener() {

                @Override
                public void onResults(Bundle results) {
                    ArrayList<String> voiceResults = results
                            .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                    if (voiceResults == null) {
                        Log.e(TAG, "No voice results");
                    } else {
                        Log.d(TAG, "Printing matches: ");
                        for (String match : voiceResults) {
                            Log.d(TAG, match);
                        }
                    }
                }

                @Override
                public void onReadyForSpeech(Bundle params) {
                    Log.d(TAG, "Ready for speech");
                }

                @Override
                public void onError(int error) {
                    Log.d(TAG, "Error listening for speech: " + error);
                }

                @Override
                public void onBeginningOfSpeech() {
                    Log.d(TAG, "Speech starting");
                }

                @Override
                public void onBufferReceived(byte[] buffer) {
                    TextView display = (TextView) findViewById(R.id.Text1);
                    display.setText("True");

                    // Append the snippet, guarding against overflowing the fixed-size array.
                    if (sigPos + buffer.length <= sig.length) {
                        System.arraycopy(buffer, 0, sig, sigPos, buffer.length);
                        sigPos += buffer.length;
                    }
                }

                @Override
                public void onEndOfSpeech() {
                }

                @Override
                public void onEvent(int eventType, Bundle params) {
                }

                @Override
                public void onPartialResults(Bundle partialResults) {
                }

                @Override
                public void onRmsChanged(float rmsdB) {
                }
            };
            recognizer.setRecognitionListener(listener);
            recognizer.startListening(intent);

            startActivityForResult(intent, check);
        }

        @Override
        public void onClick(View arg0) {
        }
    }
  • You don't need `startActivityForResult` + `onActivityResult` when you're using `SpeechRecognizer`... – Kaarel May 03 '13 at 08:22
  • Since ICS, onBufferReceived is not called any more. You cannot use speech recognizer and getting audio at the same time. – Hoan Nguyen May 05 '13 at 21:50

1 Answer


The Android speech recognition API (as of API level 17) does not offer a reliable way to capture audio.

You can use the "buffer received" callback, but note the following caveats in the documentation.

RecognitionListener says about onBufferReceived:

More sound has been received. The purpose of this function is to allow giving feedback to the user regarding the captured audio. There is no guarantee that this method will be called.

buffer: a buffer containing a sequence of big-endian 16-bit integers representing a single channel audio stream. The sample rate is implementation dependent.

and RecognitionService.Callback says about bufferReceived:

The service should call this method when sound has been received. The purpose of this function is to allow giving feedback to the user regarding the captured audio.

buffer: a buffer containing a sequence of big-endian 16-bit integers representing a single channel audio stream. The sample rate is implementation dependent.

So this callback is for feedback regarding the captured audio and not necessarily the captured audio itself, i.e. maybe a reduced version of it for visualization purposes. Also, "there is no guarantee that this method will be called", i.e. Google Voice Search might provide it in v1 but then decide to remove it in v2.

Note also that this method can be called multiple times during recognition. It is not documented however if the buffer represents the complete recorded audio or only the snippet since the last call. (I'd assume the latter, but you need to test it with your speech recognizer.)
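Assuming each call delivers only the newest snippet, accumulating them can be done with a growable byte stream instead of a fixed-size array. This is a plain-Java sketch (the `BufferAccumulator` class name is made up for illustration; only the body of `onBufferReceived` corresponds to the listener callback):

```java
import java.io.ByteArrayOutputStream;

public class BufferAccumulator {
    // Grows as needed, so no overflow bookkeeping is required.
    private final ByteArrayOutputStream audio = new ByteArrayOutputStream();

    // Call this from RecognitionListener.onBufferReceived(byte[] buffer).
    public void onBufferReceived(byte[] buffer) {
        audio.write(buffer, 0, buffer.length);
    }

    // Returns everything received so far as one contiguous array.
    public byte[] getAudio() {
        return audio.toByteArray();
    }
}
```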

So, in your implementation you should copy each buffer into a field (or a growable byte stream) and save the accumulated audio, e.g. as a WAV file, once the recognition has finished.
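As an illustration of that last step, here is a plain-Java sketch that wraps accumulated 16-bit mono PCM in a minimal 44-byte RIFF/WAVE header (the `WavWriter` name and method names are invented for this example). Since the documentation describes the samples as big-endian while WAV stores samples little-endian, the bytes of each sample are swapped first. The resulting bytes can be written to a `.wav` file and played back with Android's `MediaPlayer`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class WavWriter {

    // Builds a standard 44-byte RIFF/WAVE header for 16-bit mono PCM.
    public static byte[] wavHeader(int sampleRate, int pcmLength) {
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes());
        b.putInt(36 + pcmLength);        // overall chunk size
        b.put("WAVE".getBytes());
        b.put("fmt ".getBytes());
        b.putInt(16);                    // fmt sub-chunk size
        b.putShort((short) 1);           // audio format: PCM
        b.putShort((short) 1);           // channels: mono
        b.putInt(sampleRate);
        b.putInt(sampleRate * 2);        // byte rate = sampleRate * channels * 2
        b.putShort((short) 2);           // block align
        b.putShort((short) 16);          // bits per sample
        b.put("data".getBytes());
        b.putInt(pcmLength);
        return b.array();
    }

    // The docs describe the buffer as big-endian 16-bit samples;
    // WAV expects little-endian, so swap each byte pair.
    public static byte[] swapEndianness(byte[] pcm) {
        byte[] out = new byte[pcm.length];
        for (int i = 0; i + 1 < pcm.length; i += 2) {
            out[i] = pcm[i + 1];
            out[i + 1] = pcm[i];
        }
        return out;
    }

    // Header + byte-swapped PCM = complete WAV file contents.
    public static byte[] toWav(byte[] bigEndianPcm, int sampleRate) {
        byte[] pcm = swapEndianness(bigEndianPcm);
        byte[] header = wavHeader(sampleRate, pcm.length);
        byte[] wav = new byte[header.length + pcm.length];
        System.arraycopy(header, 0, wav, 0, header.length);
        System.arraycopy(pcm, 0, wav, header.length, pcm.length);
        return wav;
    }
}
```

Note that the sample rate is "implementation dependent" per the docs, so the value you pass here is a guess unless your recognizer documents it; 8000 or 16000 Hz are common choices.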

  • OK, but where do I put `onBufferReceived(byte[] buffer)` in my code? @Kaarel – Haneen Bassam May 02 '13 at 14:34
  • @HaneenBassam Look at the 2nd link in my answer, this contains an example of how to implement the listener. In the body of `bufferReceived` you can process the byte buffer in any way you like. – Kaarel May 02 '13 at 14:39
  • Please @Kaarel, if you can see my edited code... I added the buffer like you said – Haneen Bassam May 02 '13 at 14:54
  • @HaneenBassam I've updated the answer hopefully it brings more clarity – Kaarel May 03 '13 at 09:28
  • I wanted to make sure the buffer is working, so I added a display.setText("True") in the buffer body; if the buffer were working, the TextView should show "True", but it didn't give any output, which means it's not working. So what's the problem? Why aren't any of the RecognitionListener methods firing? @Kaarel Thank you again, and sorry to bother you! – Haneen Bassam May 05 '13 at 21:24
  • @HaneenBassam Maybe you are using a speech recognizer that does not implement these callbacks? Which speech recognizer (exact version number) are you using? You can try out Kõnele (https://code.google.com/p/recognizer-intent/) which does implement the onBufferReceived callback. – Kaarel May 06 '13 at 09:44
  • I downloaded the Kõnele app, but it doesn't support English or other languages... and how would I know which version of speech recognizer I am using? I use a Samsung Galaxy S3. @Kaarel – Haneen Bassam May 07 '13 at 15:06
  • As an end-user you can select the speech recognizer by "Settings -> Language and input -> Voice recognition" (might be different on S3). As an app developer you can override this setting (see e.g.: https://code.google.com/p/recognizer-intent/wiki/DeveloperGuide). Out of the box Kõnele supports Estonian and English (the latter only with grammars though), but the recognizer URL can be overridden to point to your own server. – Kaarel May 07 '13 at 18:43
  • To find out the version number of an app, check "Settings -> Application manager" (might be different on S3). E.g. I have Google Search v2.4.10 there, this is one possible speech recognition provider. – Kaarel May 07 '13 at 18:46
  • my application's version is v1.0 ,,, but the Google Search i have is v2.4.10 ... is that a problem? that my application version is very low? should i make it higher or something? and how would i do that? @Kaarel – Haneen Bassam May 07 '13 at 20:49
  • @HaneenBassam your application version does not matter. What matters is the version of the SpeechRecognizer component. Apparently Google Search v2.4.10 does not support onBufferReceived. The answer to your original question ("how to capture audio?") remains "this cannot be done reliably via that SpeechRecognizer-API". Let's close this topic, I don't think I can help you any further. – Kaarel May 08 '13 at 07:00
  • Related https://stackoverflow.com/questions/23047433/record-save-audio-from-voice-recognition-intent/ – Nikolay Shmyrev Sep 22 '18 at 13:59