I'm trying to get Watson Speech to Text (in an IBM Voice Gateway IVR app) to recognize alphanumeric character strings. Could I create a custom grammar or entity that restricts STT to recognizing only individual letters and numbers, excluding words altogether? For example, here's a typical string: 20Y0H8C. Watson comes back with a mix of words and numbers, such as "two" instead of "2". Digit-only strings work fine. I realize that letter recognition is problematic with typical ASR, but I'm hoping Watson is up to the task. I noticed there are no system entities for alphanumeric characters. Any suggestions are much appreciated.

Grokify

1 Answer

In this case, set smart_formatting to true.

The smart_formatting parameter converts dates, times, series of digits and numbers, phone numbers, currency values, and Internet addresses into more conventional representations in the final transcript of a recognition request. The conversion makes the transcript more readable and enables better post-processing of the transcription results. You set the parameter to true to enable smart formatting, as in the following example; by default, the parameter is false and smart formatting is not performed.

Check:

curl -X POST -u {username}:{password} \
--header "Content-Type: audio/flac" \
--data-binary @{path}audio-file.flac \
"https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?smart_formatting=true"

Result:

Voice: The quantity is one million one hundred and one

Result: The quantity is 1000101

See the official IBM documentation for details.

Note: The smart formatting feature is currently beta functionality that is available for US English only.

Community
Sayuri Mizuguchi
  • Thanks for your answer, but the issue arises when letters are spoken in the string. smart_formatting is already enabled, but it does nothing for alphanumeric strings. I've also tried using input.text.match("^[a-zA-Z0-9]*$"), which works in a chat window but is hit or miss with STT. The goal is to get Watson to accept only alphanumeric strings, thus really constraining the scope. The data is fixed-length strings (7 chars) and the letters can be anywhere. For example: HV00310. – Wilson the Dog Jun 14 '17 at 19:45
  • I should note that I'm using IBM Voice Gateway (STT is narrowband). – Wilson the Dog Jun 14 '17 at 20:02
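
One possible workaround, pending a real grammar-level fix, is to post-process the transcript before matching it. The sketch below is not a Watson or Voice Gateway feature: the function name normalize_transcript and the DIGIT_WORDS table are hypothetical, and it assumes the caller spells the code out one character at a time. It maps spoken digit words to digits, drops spaces and punctuation, and then checks the 7-character length mentioned in the comments.

```python
import re

# Hypothetical lookup of spoken digit words; extend it for your callers.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

# Fixed-length rule from the comments: exactly 7 letters or digits.
CODE_PATTERN = re.compile(r"[A-Z0-9]{7}")


def normalize_transcript(transcript):
    """Collapse an STT transcript into a 7-character code, or return None."""
    chars = []
    # Split on anything that is not a letter or digit (spaces, periods, ...).
    for token in re.findall(r"[A-Za-z0-9]+", transcript):
        lowered = token.lower()
        if lowered in DIGIT_WORDS:
            chars.append(DIGIT_WORDS[lowered])  # "two" becomes "2"
        else:
            chars.append(token.upper())  # letters and digit runs pass through
    candidate = "".join(chars)
    return candidate if CODE_PATTERN.fullmatch(candidate) else None


print(normalize_transcript("two zero Y zero H eight C"))
print(normalize_transcript("H V 0 0 3 1 0"))
```

This does not help when STT hears a letter as a word (for instance "why" for Y), so the lookup table would need entries for whatever confusions actually show up in your narrowband audio.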