
I am using the Google Speech REST API to convert speech to text, but I am getting a blank response. This is the code that builds the JSON and sends it in the HTTP POST request:

try {
    // Load the recorded file; on failure, abort instead of sending empty audio.
    File file = new File(mOutputFile.getAbsolutePath());
    byte[] bytes = loadFile(file);
    byte[] encoded = Base64.encodeBase64(bytes);
    String encodedString = new String(encoded);

    // Recognition config: these values must describe the audio actually in the file.
    JSONObject config = new JSONObject();
    config.put("encoding", "FLAC");
    config.put("sampleRateHertz", 16000);
    config.put("languageCode", "en-US");
    config.put("enableWordTimeOffsets", false);

    // The audio payload is the Base64-encoded file content.
    JSONObject audio = new JSONObject();
    audio.put("content", encodedString);

    JSONObject jsonObject = new JSONObject();
    jsonObject.put("config", config);
    jsonObject.put("audio", audio);

    // POST the request; the entity charset matches the content-type header.
    HttpClient httpClient = new DefaultHttpClient();
    HttpPost post = new HttpPost("https://speech.googleapis.com/v1/speech:recognize?key=GOOGLE_API_KEY");
    post.setHeader("content-type", "application/json; charset=UTF-8");
    StringEntity entity = new StringEntity(jsonObject.toString(), "UTF-8");
    post.setEntity(entity);
    HttpResponse resp = httpClient.execute(post);
    s = EntityUtils.toString(resp.getEntity());
    Log.e("ExecuteTask Response", "--------------" + s);
} catch (Exception exception) {
    exception.printStackTrace();
}
Nikolay Shmyrev

1 Answer


Have you checked the troubleshooting section in the official docs? It says:

"If a transcript is not returned (e.g. you receive an empty {} JSON response) and no errors have occurred, it's likely that the audio is not using the proper encoding."

Before you Base64-encode your file, make sure that the actual audio encoding matches the parameters in your JSON request. In your case those are "encoding": "FLAC" and "sampleRateHertz": 16000.
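One way to check this is to look at the first bytes of the recorded file before sending it. The following is only a minimal sketch, assuming the recording is written to a file with a standard container header; `AudioFormatSniffer` and `guessEncoding` are hypothetical names, not part of the Speech API or the Android SDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: guesses which Speech API "encoding" value fits a file
// by looking at the magic bytes at the start of the file.
final class AudioFormatSniffer {

    // Returns a Speech API encoding name, or "UNKNOWN" if no known header matches.
    static String guessEncoding(byte[] header) {
        if (startsWith(header, "fLaC")) {
            return "FLAC";
        }
        if (startsWith(header, "#!AMR-WB\n")) {
            return "AMR_WB";
        }
        if (startsWith(header, "#!AMR\n")) {
            return "AMR";
        }
        if (startsWith(header, "RIFF")) {
            // Assumption: the WAV file holds 16-bit PCM. A WAV container can
            // carry other codecs, so this is only a hint.
            return "LINEAR16";
        }
        return "UNKNOWN";
    }

    // True if data begins with the ASCII bytes of magic.
    private static boolean startsWith(byte[] data, String magic) {
        byte[] m = magic.getBytes(StandardCharsets.US_ASCII);
        if (data == null || data.length < m.length) {
            return false;
        }
        for (int i = 0; i < m.length; i++) {
            if (data[i] != m[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If `guessEncoding(bytes)` on your loaded file returns something other than the value you put in "encoding" (for example AMR_WB while the request says FLAC), the request and the audio do not match, which is the situation the docs describe.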

VictorGGl
  • mRecorder.setAudioSource(MediaRecorder.AudioSource.MIC); mRecorder.setOutputFormat(MediaRecorder.OutputFormat.AMR_WB); mRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_WB); mRecorder.setAudioEncodingBitRate(16000); mRecorder.setAudioSamplingRate(16000); and I am passing ("encoding", "AMR_WB") in the JSON. Please tell me if I am doing anything wrong. – Dhirendra Sengar Feb 02 '18 at 05:54
  • @DhirendraSengar You should verify that you can actually listen to the audio you are recording, and that the audio is clear and the speech intelligible. Also take into account what it says here about audio encoding: cloud.google.com/speech/reference/rest/v1/… "For best results, audio source should be FLAC or LINEAR16...Accuracy may be reduced if...other codecs...are used,...particularly if background noise is present" – VictorGGl Feb 05 '18 at 15:56
  • Sir, I am now using the following recording setup with ("encoding", "LINEAR16"); I tried sample rates of both 16000 and 44100: record = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize); record.startRecording(); – Dhirendra Sengar Feb 06 '18 at 05:17
  • And I still get a blank response. – Dhirendra Sengar Feb 06 '18 at 05:21
  • @DhirendraSengar the encoding looks good. But did you listen to the audio file and verify that there is actually some sound, the audio is clear and the speech intelligible? – VictorGGl Feb 06 '18 at 08:49
  • Yes sir, I can clearly hear the recorded audio. – Dhirendra Sengar Feb 06 '18 at 10:55
  • @DhirendraSengar Can you provide a link to your audio file? Don't do that if it contains some private data or some sensitive information – VictorGGl Feb 06 '18 at 12:34
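
The check for whether the recording actually contains sound, discussed in the comments above, can also be done in code on the exact bytes that get Base64-encoded. This is a rough sketch only, assuming raw 16-bit signed little-endian PCM (what LINEAR16 expects); `PcmLevelCheck` and `peakAmplitude` are hypothetical names.

```java
// Hypothetical helper: reports the loudest sample in a raw PCM buffer so that
// an all-silent (or near-silent) recording can be spotted before upload.
final class PcmLevelCheck {

    // Returns the peak absolute sample value (0..32768) of 16-bit
    // little-endian PCM data. A trailing odd byte is ignored.
    static int peakAmplitude(byte[] pcm) {
        int peak = 0;
        for (int i = 0; i + 1 < pcm.length; i += 2) {
            // Low byte first, then the signed high byte.
            int sample = (short) ((pcm[i] & 0xFF) | (pcm[i + 1] << 8));
            peak = Math.max(peak, Math.abs(sample));
        }
        return peak;
    }
}
```

A result of 0, or a very small value, for the whole buffer would suggest that the bytes being uploaded are silence or were not read as expected, even if playback on the device sounds fine.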