I am working through the provided code snippets from the Google Speech API, found here. The code should be enough to convert a .wav file into transcribed text.

The block of interest is here:

import io


def transcribe_file(speech_file):
    """Transcribe the given audio file."""
    from google.cloud import speech
    speech_client = speech.Client()

    with io.open(speech_file, 'rb') as audio_file:
        content = audio_file.read()
        audio_sample = speech_client.sample(
            content=content,
            source_uri=None,
            encoding='LINEAR16',
            sample_rate_hertz=16000)

    alternatives = audio_sample.recognize('en-US')
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))

First, the sample code seems to be out of date relative to the library I have installed: I had to change sample_rate_hertz=16000 to sample_rate=16000.

After that, I got an error for this line:
alternatives = audio_sample.recognize('en-US')
which read
AttributeError: 'Sample' object has no attribute 'recognize'

I am curious about how to rectify this. I can't seem to find any documentation on this method. Maybe it needs to be replaced too.

Monica Heddneck
  • Please take a look [here](http://stackoverflow.com/questions/38703853/how-to-use-google-speech-recognition-api-in-python/38788928#38788928), because there is a similar working example – A. STEFANI Apr 20 '17 at 17:10

2 Answers

You are using the quickstart.py example from GitHub, which appears to be out of sync with the documentation for the Google Cloud Speech API class Sample. Keep in mind that the API is still in BETA.

Assuming audio_sample is an instance of the Sample class,
then the .recognize call in your line

alternatives = audio_sample.recognize('en-US')

should be one of

async_recognize, streaming_recognize, sync_recognize
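
Since the method names differ between library versions, one way to see which of them the installed version actually provides is to inspect the object at runtime. The following is a minimal sketch of that idea; `find_recognize_methods` is a hypothetical helper and `FakeSample` is only a stand-in for the library's Sample class, not the real one.

```python
def find_recognize_methods(obj):
    """Return the public callable attribute names that contain 'recognize'."""
    return sorted(
        name
        for name in dir(obj)
        if "recognize" in name
        and not name.startswith("_")
        and callable(getattr(obj, name))
    )


class FakeSample(object):
    """Hypothetical stand-in for the library's Sample class (illustration only)."""

    def sync_recognize(self, language_code=None):
        return []

    def async_recognize(self, language_code=None):
        return None


if __name__ == "__main__":
    # With the real library you would pass audio_sample instead of FakeSample().
    print(find_recognize_methods(FakeSample()))
```

Calling `find_recognize_methods(audio_sample)` in the original function, just before the failing line, should show which of the names above exist on your installed version.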
stovfl

You need to read the file as binary, then call service.speech().syncrecognize with a body argument (a dict) that contains all the required settings, such as:

  • encoding
  • sample rate
  • language

You could try something like:

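import base64
import json

# Read the audio file as binary and base64-encode it for the JSON request body.
with open(speech_file, 'rb') as speech:
    speech_content = base64.b64encode(speech.read())

# get_speech_service() is not defined in this snippet; it must be provided elsewhere.
service = get_speech_service()
service_request = service.speech().syncrecognize(
    body={
        'config': {
            'encoding': 'LINEAR16',  # raw 16-bit signed LE samples
            'sampleRate': 16000,  # 16 kHz
            'languageCode': 'en-US',  # a BCP-47 language tag
        },
        'audio': {
            # b64encode returns bytes; decode to str so it is JSON-serializable
            'content': speech_content.decode('UTF-8')
        }
    })
response = service_request.execute()
print(json.dumps(response))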
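
The body dict above hardcodes the sample rate, which has to match the file. As a minimal sketch, the body could instead be built by a small helper that reads the rate from the WAV header using the standard-library `wave` module. `build_syncrecognize_body` is a hypothetical name; the field names are the ones used in the snippet above, and like that snippet it sends the file bytes unchanged.

```python
import base64
import io
import wave


def build_syncrecognize_body(wav_bytes, language_code="en-US"):
    """Build a syncrecognize-style body dict from the bytes of a PCM WAV file."""
    # Parse only the header to learn the sample width and sample rate.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("LINEAR16 expects 16-bit samples")
        sample_rate = wav.getframerate()

    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRate": sample_rate,  # taken from the file, not hardcoded
            "languageCode": language_code,
        },
        "audio": {
            # Same payload as the snippet above: the file bytes, base64-encoded.
            "content": base64.b64encode(wav_bytes).decode("UTF-8"),
        },
    }
```

The result could then be passed as `body=build_syncrecognize_body(open(speech_file, 'rb').read())`. A mismatch between the declared and actual sample rate is one plausible cause of poor or empty transcripts, so reading it from the header removes that source of error.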

Please take a look here, because there is a similar working example.

A. STEFANI