Can the Google Speech API be configured to return only numbers / letters?

Question

Can the Google Speech API be configured to only return numbers and letters, as opposed to full words?

The use case is transcribing spoken Canadian postal codes. For example, for M 1 B 0 R 3, Google may return "Em 1 Be 0 Are 3".

We have tried:

Using speechContexts and feeding in the letters A-Z as individual phrases. This improved the accuracy for us. We did not have much success passing in individual numbers (e.g. 1, 2, 3).
Specifying the codec and sample rate of our WAV file using the encoding and sampleRateHertz configuration options. We saw no improvement from this, as we believe Google already does a great job of auto-detecting the sample rate and encoding.
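For reference, the two attempts above can be sketched as a single request config. This is only a sketch: it builds the JSON body fields for the REST API as a plain dict rather than calling the service, `build_recognition_config` is a hypothetical helper name, and the `en-CA` language code is an assumption that is not stated in the question.

```python
import json
import string


def build_recognition_config(language_code="en-CA"):
    """Build a RecognitionConfig-style dict (hypothetical helper, not a client call)."""
    # One phrase hint per letter, A through Z, as described above.
    letters = list(string.ascii_uppercase)
    return {
        "encoding": "MULAW",             # codec of the source audio
        "sampleRateHertz": 8000,         # sample rate of the source audio
        "languageCode": language_code,   # assumption: not given in the question
        "speechContexts": [{"phrases": letters}],
    }


config = build_recognition_config()
print(json.dumps(config, indent=2))
```

Digits are left out of the phrase list on purpose, since passing them individually did not help in our tests.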

Our audio file is sampled at 8000 Hz and encoded as mu-law ("MULAW"). We have no flexibility in changing the sample rate or encoding.

Is there a way to get a more accurate response from Google for this use case? Even ideas for better speechContexts phrases are welcome.

Thank you

You also asked https://stackoverflow.com/questions/45312110/can-microsoft-bing-speech-be-configured-to-return-only-numbers-letters — Nikolay Shmyrev, Jul 26 '17 at 17:11
In such a case it is better to train an open source recognizer; it will be much more responsive too. — Nikolay Shmyrev, Jul 26 '17 at 17:12
I presume you are referring to a tool such as CMUSphinx, which I see you are a developer for. I can give this a shot, as this is a greenfield project. — Bobby Bruce, Jul 27 '17 at 13:29
Yes - I did ask the same question as I've been testing with Bing Speech as well. That question is slightly different though, as I believe Microsoft offers more granular controls, or "scenarios", to interpret speech. My current accuracy is poor - about a 35% match rate. — Bobby Bruce, Jul 27 '17 at 13:31
I gave an answer at https://stackoverflow.com/questions/45312110/can-microsoft-bing-speech-be-configured-to-return-only-numbers-letters/45360883#45360883. I'm going to flag this one as a duplicate. Go ahead and edit your first question to include more information if you want. — John Wiseman, Jul 27 '17 at 22:41
Possible duplicate of [Can Microsoft Bing Speech be configured to return only numbers / letters?](https://stackoverflow.com/questions/45312110/can-microsoft-bing-speech-be-configured-to-return-only-numbers-letters) — John Wiseman, Jul 27 '17 at 22:42
@JohnWiseman these are two similar questions, but they discuss two very different APIs. — Bobby Bruce, Jul 28 '17 at 12:06

score 1 · Answer 1 · answered Aug 30 '18 at 10:17

We are seeing the same results. We would love to have a syntax-based "context" suggestion, or a parameter that forces digit-only output.

Changing the API version doesn't fix the way digits are recognised, not even when using model: phone_call.

What actually worked better for recognising some kinds of numbers was switching to the en-US locale, which in turn made the recognition engine treat a list of numbers as a phone number. The result came back in phone-like syntax, +XXX-XXX-XXX-XXXX, and this made detection really good.
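If the transcript does come back in that phone-like form, the formatting is easy to strip again. A minimal sketch, assuming the transcript is a plain string; `digits_only` is a hypothetical helper and the sample value is made up for illustration:

```python
import re


def digits_only(transcript):
    """Drop every character that is not an ASCII digit (hypothetical helper)."""
    return re.sub(r"[^0-9]", "", transcript)


# Made-up sample in the +XXX-XXX-XXX-XXXX shape described above.
print(digits_only("+123-456-789-0123"))  # 1234567890123
```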

So I don't understand why Google does syntax matching behind the scenes but doesn't make it available through their API.
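In the meantime, a similar kind of matching can be approximated on the client side, after recognition. The following is a rough sketch for the postal-code case from the question, not anything the API provides: the `SPOKEN_TO_CHAR` table is hypothetical and incomplete, `normalize_postal_code` is a made-up name, and the pattern is a simplified letter-digit-letter digit-letter-digit check.

```python
import re

# Hypothetical, incomplete table of spoken forms a recognizer might return.
# It would need tuning against real transcripts.
SPOKEN_TO_CHAR = {
    "em": "M", "be": "B", "bee": "B", "are": "R", "see": "C", "sea": "C",
    "you": "U", "why": "Y", "oh": "0", "zero": "0", "one": "1", "two": "2",
    "three": "3", "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

# Simplified shape of a Canadian postal code with the space removed.
POSTAL_CODE = re.compile(r"[A-Z][0-9][A-Z][0-9][A-Z][0-9]")


def normalize_postal_code(transcript):
    """Map each token to one character; return 'A1A 1A1' form or None."""
    chars = []
    for token in transcript.split():
        key = token.lower().strip(".,")
        if key in SPOKEN_TO_CHAR:
            chars.append(SPOKEN_TO_CHAR[key])
        elif len(key) == 1 and key.isalnum():
            chars.append(key.upper())  # already a single letter or digit
        else:
            return None  # unknown word: give up rather than guess
    code = "".join(chars)
    if POSTAL_CODE.fullmatch(code):
        return code[:3] + " " + code[3:]
    return None


print(normalize_postal_code("Em 1 Be 0 Are 3"))  # M1B 0R3
```

Returning None for anything that does not fit the pattern makes it easy to fall back to asking the caller to repeat the code.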
