I have stored the results from Google Speech in a variable:

data = {'name': '1235433175192040985', 'metadata': {'@type': 'type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata', 'progressPercent': 100, 'startTime': '2018-04-11T12:56:58.237060Z', 'lastUpdateTime': '2018-04-11T12:57:44.944653Z'}, 'done': true, 'response': {'@type': 'type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeResponse', 'results': [{'alternatives': [{'transcript': 'hi how are you', 'confidence': 0.92438406}]}, {'alternatives': [{'transcript': 'How are you doing?', 'confidence': 0.9402676}]}]}}

json_dict = json.loads(data)

This throws the error:

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

For the rest of the parsing I wrote:

for result in json_dict["response"]["results"]:
  if "alternatives" in result:
    alternatives = result["alternatives"][0]
    if "confidence" in alternatives:
      print(alternatives["confidence"])
    if "transcript" in alternatives:
      print(alternatives["transcript"])

What am I doing wrong?

  • Is there any way, or any language, in which I can parse this data? I have tons of it and I don't want to regenerate it all. –  Apr 18 '18 at 11:38
  • Could you paste the original JSON? `json.loads` receives a string and you are using a `dict`. – snahor Apr 18 '18 at 12:26

3 Answers

0

The JSON parser in Python expects your blob to use double quotation marks, since that is the JSON standard.

{
  "name": "John Doe"
}

You could replace the single quotes with double quotes, as explained in this answer.
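
As a rough sketch of that quote replacement, assuming the variable actually holds a string rather than a dict: the `raw` name and the shortened sample below are illustrative and not taken from the question. A blanket replace is only safe while no value contains an apostrophe or an embedded double quote.

```python
import json

# Hypothetical, shortened stand-in for the stored text:
# single-quoted keys and values, but a JSON-style lowercase `true`.
raw = "{'done': true, 'response': {'results': [{'alternatives': [{'transcript': 'hi how are you', 'confidence': 0.92438406}]}]}}"

# As written, the text is not valid JSON, so json.loads rejects it.
try:
    json.loads(raw)
    parsed_as_is = True
except json.JSONDecodeError:
    parsed_as_is = False

# Naive repair: swap every single quote for a double quote.
# Assumption: no value contains an apostrophe (e.g. "don't") or a double quote.
fixed = raw.replace("'", '"')
json_dict = json.loads(fixed)

first = json_dict["response"]["results"][0]["alternatives"][0]
print(parsed_as_is)
print(first["transcript"])
```

A transcript such as "don't" would break this approach, so it is worth checking a few stored records before running it over everything.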

However, I’m pretty sure the problem can be solved elsewhere, since the Google API most likely returns valid JSON in its responses. How do you parse the response from Google’s API?

0

The problem in your snippet is that you're passing a dict to json.loads. json.loads decodes a JSON string into a dict, so calling it on a dict is redundant and wrong. Read the docs.
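
To illustrate the distinction, here is a small sketch; the `as_dict` and `as_text` names are made up for the example. json.loads accepts text, and handing it a dict raises a TypeError instead of returning the dict unchanged.

```python
import json

as_dict = {"done": True, "response": {"results": []}}

# json.dumps goes dict -> JSON text; json.loads goes JSON text -> dict.
as_text = json.dumps(as_dict)
round_tripped = json.loads(as_text)

# Passing the dict itself is rejected: json.loads only accepts
# str, bytes or bytearray.
try:
    json.loads(as_dict)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(as_text)
print(error_name)
```

Note that the error quoted in the question is a JSONDecodeError, not a TypeError, which may mean the variable there holds text and not a dict; checking `type(data)` would settle it.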

Nitzan M
0

The dict doesn't need any further json methods; you can work with it as is.

for result in data["response"]["results"]:
  if "alternatives" in result:
    alternatives = result["alternatives"][0]
    if "confidence" in alternatives:
      print(alternatives["confidence"])
    if "transcript" in alternatives:
      print(alternatives["transcript"])

Yields this output:

0.92438406
hi how are you
0.9402676
How are you doing?
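
If the values are needed for later processing and not just for printing, one possible variation collects them into a list. This is only a sketch: `first_alternatives` is a hypothetical helper name, and the `data` literal is a trimmed copy of the question's structure.

```python
def first_alternatives(data):
    """Return (transcript, confidence) for the first alternative of each result."""
    pairs = []
    for result in data.get("response", {}).get("results", []):
        alternatives = result.get("alternatives") or []
        if not alternatives:
            continue  # skip results that carry no alternatives
        top = alternatives[0]
        # .get returns None when a key is missing instead of raising KeyError
        pairs.append((top.get("transcript"), top.get("confidence")))
    return pairs


# Trimmed copy of the structure shown in the question.
data = {"response": {"results": [
    {"alternatives": [{"transcript": "hi how are you", "confidence": 0.92438406}]},
    {"alternatives": [{"transcript": "How are you doing?", "confidence": 0.9402676}]},
]}}

for transcript, confidence in first_alternatives(data):
    print(confidence)
    print(transcript)
```
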
Chris Decker