Google speech API v1beta1 (syncrecognize and asyncrecognize API call)

Question

I am a Java developer and I have a couple of questions about the Google Speech API v1beta1.

Question 1 (Syncrecognize case):

I uploaded a small audio file (less than one minute long) through GCS and sent it to the Google Speech API. The call works, but the reported confidence is only 0.32497215, and the transcript does not exactly match what is said in the audio.

How can I increase the confidence of the result?

Question 2 (Asyncrecognize case):

I then tried a larger audio file (more than one minute long). For this case I used the following API call:

https://speech.googleapis.com/v1beta1/speech:asyncrecognize?key=XXXXXXXXXXXXXXXXXXXX

with this payload (built as a Java string):

"{\"config\":{\"encoding\":\"LINEAR16\",\"sample_rate\": 16000},\"audio\":{\"uri\":\"gs://" + bucketName + "/" + objectName + "\"}}"

The response was JSON like this:

{"name": "57...........................95"}.

After getting this output, I made a second API call (the Operations interface) with this name value:

https://speech.googleapis.com/v1beta1/operations/57.................................95?key=XXXXXXXXXXXXXXXXX

I got this output:

{
 "name": "57....................................95",
 "done": true,
 "response": {
   "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
 }
}

How do I proceed from here? The response contains no results, and I need to get the transcribed text of the audio.

Please help me fix these issues. Thanks in advance.

On the second part, related question is http://stackoverflow.com/questions/38906527/asyncrecognize-result-is-empty — Nikolay Shmyrev, Aug 12 '16 at 02:46
Please split question into 2. Are you using examples from https://github.com/GoogleCloudPlatform/java-docs-samples (speech)? — Clemens Tolboom, Sep 01 '16 at 12:27

score 1 · Answer 1 · answered Oct 21 '16 at 09:11

Ideas for Question 1:

Give more details in the RecognitionConfig object: for example, specify the languageCode and add hints (words or phrases you expect in the audio) via the SpeechContext object.
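As a rough illustration of the suggestion above, here is a minimal Java sketch that builds a request payload with a language code and phrase hints. The JSON field names (sampleRate, languageCode, speechContext, phrases) are my reading of the v1beta1 RecognitionConfig and should be checked against the API reference; the class name, bucket, object and phrases are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RecognizePayload {
    // Wraps a value in double quotes, escaping backslashes and quotes for JSON.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the request body. Field names are assumptions based on the
    // v1beta1 RecognitionConfig; verify them against the API reference.
    static String build(String bucketName, String objectName, int sampleRate,
                        String languageCode, List<String> phrases) {
        String phraseArray = phrases.stream()
                .map(RecognizePayload::quote)
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"config\":{"
                + "\"encoding\":\"LINEAR16\","
                + "\"sampleRate\":" + sampleRate + ","
                + "\"languageCode\":" + quote(languageCode) + ","
                + "\"speechContext\":{\"phrases\":" + phraseArray + "}"
                + "},\"audio\":{\"uri\":"
                + quote("gs://" + bucketName + "/" + objectName) + "}}";
    }

    public static void main(String[] args) {
        // Hypothetical bucket, object and hint phrases, for illustration only.
        System.out.println(build("my-bucket", "audio.raw", 16000, "en-US",
                Arrays.asList("syncrecognize", "asyncrecognize")));
    }
}
```

The hints are most useful for domain-specific words or names that the recognizer is unlikely to guess on its own.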

Answer to Question 2:

Check the sample rate of the audio file; it must be equal to the rate you gave in the request. You can check it, for example, with the command soxi audio_file.flac (this requires sox to be installed).
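If you would rather do the same check from Java, the following is a minimal sketch using the JDK's javax.sound.sampled package. It assumes the file has a header the stock JDK can read (for example WAV); it will not work for FLAC without an extra provider, nor for headerless raw PCM. The class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

class SampleRateCheck {
    // Reads the sample rate (in Hz) from the audio file's header.
    static int sampleRateOf(File audioFile)
            throws IOException, UnsupportedAudioFileException {
        AudioFormat format = AudioSystem.getAudioFileFormat(audioFile).getFormat();
        return Math.round(format.getSampleRate());
    }

    // True when the file's rate equals the rate declared in the request config.
    static boolean matches(File audioFile, int declaredRate)
            throws IOException, UnsupportedAudioFileException {
        return sampleRateOf(audioFile) == declaredRate;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: SampleRateCheck <audio-file> <declared-rate>");
            return;
        }
        File file = new File(args[0]);
        int declared = Integer.parseInt(args[1]);
        System.out.println("file rate: " + sampleRateOf(file)
                + " Hz, declared: " + declared + " Hz, match: "
                + matches(file, declared));
    }
}
```

If the two rates differ, either resample the audio or change the rate in the request so that they agree.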
