I am trying to port a TTS app that uses in-text control tags from desktop/web/iOS to Android. The app generates a text file containing the text to be spoken, with silent periods between the spoken words. Silent periods are represented with in-text control tags, such as the SAPI TTS tag <silence msec="1000"/>
or the iOS TTS engine tag [[slnc 10000]].
The text sent to the SAPI TTS speech synthesizer looks like this:
Text one <silence msec="750"/> text two <silence msec="1000"/> text three <silence msec="500"/> Text four <silence msec="600"/> Text five.....
Similarly, for iOS TTS the in-text control tag for silence is [[slnc 10000]],
and the text sent to the speech synthesizer looks like this:
Text one [[slnc 750]] text two [[slnc 10000]] text three [[slnc 500]] text four [[slnc 600]] text five......
Android TTS doesn't seem to support in-text control tags for the speech synthesizer. Also, as far as I can tell, the following two variants of the speak()
method use a Google web service, so accurately timing the spoken text coming back from the speech synthesizer server against silence periods produced in code may be impossible, or unreliable at best:
speak(speech, TextToSpeech.QUEUE_FLUSH, null);
speak(speech, TextToSpeech.QUEUE_ADD, null);
I welcome any Android solution that focuses on preserving accurate timing of silence periods between spoken words.
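One direction I have been considering, shown here only as a minimal sketch: split the tagged text into alternating "speak" and "pause" segments in plain Java, then queue each segment separately. The class name SilenceTagSplitter, its Segment type, and the split() method are hypothetical names made up for illustration; the regular expression assumes the tags appear exactly in the two forms shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of any Android API): turns tagged text
// into an ordered list of spoken-text and silence segments.
class SilenceTagSplitter {

    // One piece of the script: either text to speak or a pause.
    static final class Segment {
        final String text;     // null when this segment is a pause
        final long silenceMs;  // 0 when this segment is spoken text

        Segment(String text, long silenceMs) {
            this.text = text;
            this.silenceMs = silenceMs;
        }

        boolean isSilence() {
            return text == null;
        }
    }

    // Matches [[slnc 750]] (group 1) or <silence msec="750"/> (group 2).
    private static final Pattern TAG = Pattern.compile(
            "\\[\\[slnc\\s+(\\d+)\\]\\]|<silence\\s+msec=\"(\\d+)\"\\s*/>");

    static List<Segment> split(String script) {
        List<Segment> out = new ArrayList<>();
        Matcher m = TAG.matcher(script);
        int last = 0;
        while (m.find()) {
            // Text between the previous tag and this one is spoken.
            addText(out, script.substring(last, m.start()));
            String ms = m.group(1) != null ? m.group(1) : m.group(2);
            out.add(new Segment(null, Long.parseLong(ms)));
            last = m.end();
        }
        // Trailing text after the final tag.
        addText(out, script.substring(last));
        return out;
    }

    // Adds a spoken segment, skipping whitespace-only gaps between tags.
    private static void addText(List<Segment> out, String s) {
        String t = s.trim();
        if (!t.isEmpty()) {
            out.add(new Segment(t, 0));
        }
    }

    public static void main(String[] args) {
        String script = "Text one [[slnc 750]] text two <silence msec=\"1000\"/> text three";
        for (Segment s : split(script)) {
            System.out.println(s.isSilence()
                    ? "pause " + s.silenceMs + " ms"
                    : "speak: " + s.text);
        }
    }
}
```

On Android, each spoken segment could then presumably be passed to speak(text, TextToSpeech.QUEUE_ADD, ...) and each pause to TextToSpeech.playSilentUtterance(durationInMs, TextToSpeech.QUEUE_ADD, utteranceId) on API 21+, or playSilence(durationInMs, TextToSpeech.QUEUE_ADD, null) on older versions. I have not verified how precise the resulting pause lengths are, which is the part I am most unsure about.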