My team is building an app that uses Android's SpeechRecognizer. The app acts as a voice interface to the user's bank account, meaning the user can ask the app "What is my balance?". The app then fetches the data from a banking backend (the speech is converted to a text intent) and presents the result to the user both via speech and in a chat bot. We are using the SpeechRecognizer component of Android to handle the conversion from speech to text and back.
Another command for the app could be: "Transfer 50 Euro to the account of my wife with the number 12314567893231."
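For context, here is a minimal sketch of how we invoke the recognizer (simplified; error handling and most listener callbacks omitted). Note that `EXTRA_PREFER_OFFLINE` (API 23+) is only a hint that recognition should stay on the device, and the platform may ignore it, which is part of why I am asking:

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

// Simplified sketch of our recognizer setup. EXTRA_PREFER_OFFLINE hints
// that recognition should happen on the device, but it is not guaranteed.
fun startListening(context: Context, onResult: (String) -> Unit) {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)
    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onResults(results: Bundle) {
            // Take the top transcription, e.g. "What is my balance?"
            results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onResult)
        }
        // Remaining callbacks left empty for brevity.
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onError(error: Int) {}
        override fun onPartialResults(partialResults: Bundle?) {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    })
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // Hint only; the recognition service may still use the network.
        putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
    }
    recognizer.startListening(intent)
}
```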
My questions are:
- Does the speech-to-text and text-to-speech processing happen on the device or on a Google server?
- If it is done on the server: what data is cached or stored on the Google server in that case?
- Is the data stored in such a way that no conclusions can be drawn about account balances and transfer recipients?
Any ideas about this topic?