PocketSphinx on Android recognises words even if they aren't spoken

Question

I am using PocketSphinx for android-23 to build an offline assistant for one of my apps. I have successfully used recognizer.addKeyphraseSearch to set up the wake-up phrase: in this case, I say "hello" to activate the assistant.

This is my entire code:

public class Farmax_2 extends Activity implements
        RecognitionListener {

    /* Named searches allow quick reconfiguration of the decoder */
    private static final String KWS_SEARCH = "wakeup";
    private static final String ahead = "about";
    private static final String PHONE_SEARCH = "ahead";
    private static final String MENU_SEARCH = "menu";
    TextToSpeech t1;
    Button btn;
    /* Keyword we are looking for to activate menu */
    private static final String KEYPHRASE = "hello";

    /* Used to handle permission request */
    private static final int PERMISSIONS_REQUEST_RECORD_AUDIO = 1;

    private SpeechRecognizer recognizer;
    private HashMap<String, Integer> captions;

    @Override
    public void onCreate(Bundle state) {
        super.onCreate(state);
        setContentView(R.layout.activity_farmax_2);
        // Check if user has given permission to record audio
        int permissionCheck = ContextCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.RECORD_AUDIO);
        if (permissionCheck != PackageManager.PERMISSION_GRANTED) {
            ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.RECORD_AUDIO}, PERMISSIONS_REQUEST_RECORD_AUDIO);
            return;
        }
        btn=(Button)findViewById(R.id.buttonme);
        btn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                Intent inte = new Intent(Farmax_2.this, MainMenu.class);
                startActivity(inte);
            }
        });

        t1 = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status != TextToSpeech.ERROR) {
                    t1.setLanguage(Locale.UK);
                }
            }
        });
        runRecognizerSetup();
    }
    public void omku(View view) {
        Intent in = new Intent(this, abtus.class);
        startActivity(in);
    }


    private void runRecognizerSetup() {
        // Recognizer initialization is time-consuming and involves IO,
        // so we execute it in an async task
        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                try {
                    Assets assets = new Assets(Farmax_2.this);
                    File assetDir = assets.syncAssets();
                    setupRecognizer(assetDir);
                } catch (IOException e) {
                    return e;
                }
                return null;
            }

            @Override
            protected void onPostExecute(Exception result) {
                if (result != null) {

                } else {
                    switchSearch(KWS_SEARCH);
                }
            }
        }.execute();
    }

    @Override
    public void onRequestPermissionsResult(int requestCode,
                                           String[] permissions, int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);

        if (requestCode == PERMISSIONS_REQUEST_RECORD_AUDIO) {
            if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                runRecognizerSetup();
            } else {
                finish();
            }
        }
    }

    @Override
    public void onDestroy() {
        super.onDestroy();

        if (recognizer != null) {
            recognizer.cancel();
            recognizer.shutdown();
        }
    }

    /**
     * In partial result we get quick updates about current hypothesis. In
     * keyword spotting mode we can react here, in other modes we need to wait
     * for final result in onResult.
     */
    @Override
    public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis == null)
            return;

        String text = hypothesis.getHypstr();
        switch (text) {
            case KEYPHRASE: {
                omkar();
                break;
            }

            case ahead: {
                Intent in = new Intent(this, abtus.class);
                startActivity(in);
                break;
                //  t1.speak("taking you to privacy policy of farmax.", TextToSpeech.QUEUE_FLUSH, null);
            }

            case PHONE_SEARCH: {

                Intent in = new Intent(this, MainMenu.class);
                startActivity(in);
                //   t1.speak("Main Menu.", TextToSpeech.QUEUE_FLUSH, null);
                break;
            }
        }
    }

    private void omkar() {
        t1.speak("Yes sir.", TextToSpeech.QUEUE_FLUSH, null);
        switchSearch(MENU_SEARCH);
    }

    /**
     * This callback is called when we stop the recognizer.
     */
    @Override
    public void onResult(Hypothesis hypothesis) {
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();

        }
    }

    @Override
    public void onBeginningOfSpeech() {
    }

    /**
     * We stop recognizer here to get a final result
     */
    @Override
    public void onEndOfSpeech() {

    }

    private void switchSearch(String searchName) {
        recognizer.stop();

        // If we are not spotting, start listening with timeout (10000 ms or 10 seconds).
        if (searchName.equals(KWS_SEARCH))
            recognizer.startListening(searchName);
        else
            recognizer.startListening(searchName, 10000);


    }

    private void setupRecognizer(File assetsDir) throws IOException {
        // The recognizer can be configured to perform multiple searches
        // of different kind and switch between them

        recognizer = SpeechRecognizerSetup.defaultSetup()
                .setAcousticModel(new File(assetsDir, "en-us-ptm"))
                .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))

                .setRawLogDir(assetsDir) // To disable logging of raw audio comment out this call (takes a lot of space on the device)

                .getRecognizer();
        recognizer.addListener(this);

        /** In your application you might not need to add all those searches.
         * They are added here for demonstration. You can leave just one.
         */

        // Create keyword-activation search.
        recognizer.addKeyphraseSearch(KWS_SEARCH, "hello");

        // Create grammar-based search for selection between demos
        File menuGrammar = new File(assetsDir, "firstscn.gram");
        recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);

        // Create grammar-based search for digit recognition

    }

    @Override
    public void onError(Exception error) {
    }

    @Override
    public void onTimeout() {
        switchSearch(KWS_SEARCH);
    }


}

When I say "hello", it responds correctly by replying "Yes sir" via TTS. After that it is supposed to switch to the menu search and wait for further commands. In this case there are two:

@Override
    public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis == null)
            return;

        String text = hypothesis.getHypstr();
        switch (text) {
            case KEYPHRASE: {
                omkar();
                break;
            }

            case ahead: {
                Intent in = new Intent(this, abtus.class);
                startActivity(in);
                break;
                //  t1.speak("taking you to privacy policy of farmax.", TextToSpeech.QUEUE_FLUSH, null);
            }

            case PHONE_SEARCH: {

                Intent in = new Intent(this, MainMenu.class);
                startActivity(in);
                //   t1.speak("Main Menu.", TextToSpeech.QUEUE_FLUSH, null);
                break;
            }
        }
    }

The problem is that it doesn't wait for my command after it switches to the menu search.

Sometimes a toast pops up with "about" and sometimes with "ahead", even though I don't say either word. The app then freezes badly and leaves me no option but to close it.

I have also tried if/else statements instead of switch/case, but they don't seem to help much. I have also tried moving the above code into onResult rather than onPartialResult, but that doesn't help either.

Using the Sphinx tools, I have created my own dictionary and grammar file. Here is the content of the grammar file for this case:

#JSGF V1.0;

grammar firstscn;

public <item> = about | ahead;

Where am I going wrong? Please help me.

You have to pause speech recognition while TTS is working, you should start recognition only when TTS is over. — Nikolay Shmyrev, Apr 16 '17 at 21:55
Possible duplicate of http://stackoverflow.com/questions/29109464/stop-pocketsphinx-recognizer-for-voice-feedback — Nikolay Shmyrev, Apr 16 '17 at 21:58
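
The suggestion in the comments above (keep recognition paused while TTS is speaking, and start it only once TTS is done) can be sketched roughly as below. This is a minimal illustration in plain Java, not the app's actual fix: Recognizer and Speaker are hypothetical stand-ins for the PocketSphinx SpeechRecognizer and Android's TextToSpeech, and TtsGatedListening, onKeyphrase and onPromptDone are made-up names. The point is only the ordering: stop the recognizer, speak, and call startListening from the "speech finished" callback instead of immediately after speak(), as omkar() in the question does. Starting to listen immediately may be why the recognizer picks up words nobody said, since the microphone can still hear the spoken reply.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: keep the recognizer stopped while a prompt is spoken and start
 * the next search only from the speech-finished callback.
 */
class TtsGatedListening {

    /** Hypothetical stand-in for the two recognizer calls used in the question. */
    interface Recognizer {
        void stop();
        void startListening(String searchName, int timeoutMs);
    }

    /** Hypothetical stand-in for TTS; onDone must run once playback has ended. */
    interface Speaker {
        void speak(String text, Runnable onDone);
    }

    private final Recognizer recognizer;
    private final Speaker speaker;
    private boolean speaking;

    TtsGatedListening(Recognizer recognizer, Speaker speaker) {
        this.recognizer = recognizer;
        this.speaker = speaker;
    }

    /** Call when the keyphrase is spotted; repeated calls during playback are ignored. */
    synchronized void onKeyphrase(String prompt, String nextSearch, int timeoutMs) {
        if (speaking) {
            return;
        }
        speaking = true;
        recognizer.stop(); // microphone off before the prompt plays
        speaker.speak(prompt, () -> onPromptDone(nextSearch, timeoutMs));
    }

    /** Runs when the prompt has finished: only now is the next search started. */
    private synchronized void onPromptDone(String nextSearch, int timeoutMs) {
        speaking = false;
        recognizer.startListening(nextSearch, timeoutMs);
    }

    synchronized boolean isSpeaking() {
        return speaking;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Runnable> pending = new ArrayList<>();
        Recognizer rec = new Recognizer() {
            public void stop() { log.add("stop"); }
            public void startListening(String s, int t) { log.add("listen:" + s + ":" + t); }
        };
        // Fake speaker: records the prompt and holds the callback until "playback" ends.
        Speaker tts = (text, onDone) -> { log.add("speak:" + text); pending.add(onDone); };

        TtsGatedListening gate = new TtsGatedListening(rec, tts);
        gate.onKeyphrase("Yes sir.", "menu", 10000);
        pending.remove(0).run(); // simulate the end of playback
        System.out.println(log);
    }
}
```

In the real activity, the Speaker role would presumably be filled by TextToSpeech with an UtteranceProgressListener registered through setOnUtteranceProgressListener, using its onDone callback as the "finished" signal. That callback may not arrive on the main thread, so the startListening call would likely need to be posted back to the UI thread. Both points are assumptions to check against the Android documentation.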
