In our church we have a few Ukrainian refugees who attend the services. To help them follow the sermon, I made an app that sends real-time translations to Telegram.
I implemented the Google Speech-to-Text API following this sample: https://github.com/googleapis/java-speech/blob/main/samples/snippets/src/main/java/com/example/speech/InfiniteStreamRecognize.java
This works well, but the recognition is often not accurate enough. Is it possible in Google to upload audio files together with their transcriptions, so that it can learn how this speaker talks? We always have the same speaker, so if I can get Google to 'know' the speaker, I think the accuracy could be much higher. Or does somebody have another idea for improving the accuracy? I did try the speech adaptation boost (https://cloud.google.com/speech-to-text/docs/boost), but that didn't noticeably help.
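
To make "not accurate enough" measurable, one option is to compare the recognized text against a manual transcription of the same sermon and compute the word error rate (WER). Below is a minimal sketch of that idea, assuming plain-text transcripts. The class `WordErrorRate` and its methods are hypothetical helpers written for illustration; they are not part of the Google client library, and the sample sentences are made up.

```java
import java.util.Locale;

public class WordErrorRate {

    // Splits text into lowercase words, dropping punctuation and extra whitespace.
    static String[] tokenize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}']+", " ")
                .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }

    // WER = word-level edit distance (substitutions + deletions + insertions)
    // divided by the number of words in the reference transcript.
    static double compute(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }

        // Two-row dynamic programming table for the Levenshtein distance over words.
        int[] prev = new int[hyp.length + 1];
        int[] curr = new int[hyp.length + 1];
        for (int j = 0; j <= hyp.length; j++) {
            prev[j] = j; // turning an empty reference into j words takes j insertions
        }
        for (int i = 1; i <= ref.length; i++) {
            curr[0] = i; // turning i reference words into nothing takes i deletions
            for (int j = 1; j <= hyp.length; j++) {
                int cost = ref[i - 1].equals(hyp[j - 1]) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(prev[j] + 1, curr[j - 1] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return (double) prev[hyp.length] / ref.length;
    }

    public static void main(String[] args) {
        // Made-up example: manual transcript vs. recognizer output.
        String reference = "blessed are the poor in spirit";
        String hypothesis = "blessed are the pour in spirit";
        System.out.printf(Locale.ROOT, "WER = %.3f%n", compute(reference, hypothesis));
    }
}
```

Running this on the same recordings with and without the boost phrases would show whether a setting really changes the error rate, rather than relying on impression.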