Can the Google Speech API be configured to only return numbers and letters, as opposed to full words?
The use case is transcribing Canadian postal codes, e.g. M 1 B 0 R 3, for which Google may return "Em 1 Be 0 Are 3".
We have tried:
- Using speechContexts and feeding in the letters A-Z as individual phrases. This improved the accuracy for us. We did not have much success passing in individual numbers (e.g. 1, 2, 3).
- Specifying the codec and sample rate of our WAV file using the encoding and sampleRateHertz configuration options. We saw no improvement from doing this, as we believe Google already does a great job of auto-detecting the sample rate and encoding.
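For reference, a request combining the two attempts above would look roughly like the sketch below. It builds the JSON body as a Python dict using the config fields named above (speechContexts, encoding, sampleRateHertz). The function name build_recognize_request, the "en-CA" language code, and the placeholder audio content are assumptions made for illustration, not our exact code.

```python
import json
import string

# Phrase hints: every letter A-Z and every digit 0-9 as its own phrase.
LETTERS = list(string.ascii_uppercase)
DIGITS = list(string.digits)


def build_recognize_request(audio_base64: str) -> dict:
    """Build a recognize request body (hypothetical helper, for illustration)."""
    return {
        "config": {
            "encoding": "MULAW",        # our audio is mu-law encoded
            "sampleRateHertz": 8000,    # fixed 8000 Hz source audio
            "languageCode": "en-CA",    # assumption: Canadian English
            "speechContexts": [{"phrases": LETTERS + DIGITS}],
        },
        "audio": {"content": audio_base64},
    }


if __name__ == "__main__":
    # Placeholder content; a real request carries base64-encoded audio bytes.
    request = build_recognize_request("<base64-encoded audio>")
    print(json.dumps(request["config"], indent=2))
```

The letters and digits are sent as a single speechContexts entry here; whether splitting them into separate entries changes anything is something we have not verified.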
Our audio file is sampled at 8000 Hz and encoded with "M-ULAW". We have no flexibility to change the sample rate or encoding.
Is there a way to get a more accurate response from Google for this use case? Even ideas for better speechContexts phrases are welcome.
Thank you