I am trying to use Amazon Lex as the conversation engine in a home assistant via the Python SDK (boto3). The post_content method seems appropriate, and I did get it to work on text-only test examples. However, I cannot figure out how to interact with it directly via streaming audio.
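For reference, a text-only call along these lines works fine for me (the bot name, alias, user ID, and utterance are all placeholders):

import boto3

lex_client = boto3.client("lex-runtime")

# Text input: post_content also accepts plain text when the content type says so.
response = lex_client.post_content(
    botName="BOT_NAME",
    botAlias="BOT_ALIAS",
    userId="USER_ID",
    contentType="text/plain; charset=utf-8",
    inputStream=b"what is the weather like",
)
print(response.get("message"))

Here is my attempt at doing the same thing with live audio from the microphone: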
import pyaudio
import boto3

pa = pyaudio.PyAudio()
audio_stream = pa.open(
    rate=16000,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=1024,
)

lex_client = boto3.client("lex-runtime")
response = lex_client.post_content(
    botName="BOT_NAME",
    botAlias="BOT_ALIAS",
    userId="USER_ID",
    contentType="audio/l16; rate=16000; channels=1",
    inputStream=audio_stream,
)
print(response)
This raises the following error:
botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: 'Stream' object is not iterable
Fair enough, so I tried inputStream=audio_stream.read(1024), which runs without a problem but doesn't recognize any spoken text (i.e. 'inputTranscript': '' in the response). I imagine this is because a single 1024-frame chunk is simply too short to contain any meaningful speech.
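In case it is relevant, this is the kind of fixed-window variation I have been experimenting with: record a few seconds into one buffer and send it in a single call (the five-second window is an arbitrary choice of mine). It is obviously batch-style rather than real streaming, which is what I am actually after:

import boto3
import pyaudio

# Same setup as above: 16 kHz, 16-bit mono microphone input.
pa = pyaudio.PyAudio()
audio_stream = pa.open(
    rate=16000,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=1024,
)
lex_client = boto3.client("lex-runtime")

# Record a fixed window into memory, 1024 frames at a time.
RECORD_SECONDS = 5
frames = [
    audio_stream.read(1024)
    for _ in range(int(16000 / 1024 * RECORD_SECONDS))
]

# Send the whole buffer at once instead of passing the stream object.
response = lex_client.post_content(
    botName="BOT_NAME",
    botAlias="BOT_ALIAS",
    userId="USER_ID",
    contentType="audio/l16; rate=16000; channels=1",
    inputStream=b"".join(frames),
)
print(response.get("inputTranscript"))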
I am fairly inexperienced with web development, so I suspect I am missing something very obvious. Looking at how audio streaming is apparently handled in Amazon Transcribe, it seems like I should be using async code and callback functions.
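For what it's worth, this is roughly what I picture the capture side looking like with PyAudio's callback mode (the queue and the callback are my own guesses, not from any Lex example), but I don't see how to turn the queue into something post_content will accept:

import queue

import pyaudio

audio_queue = queue.Queue()

def _fill_queue(in_data, frame_count, time_info, status):
    # PyAudio calls this from its own thread for every captured buffer.
    audio_queue.put(in_data)
    return (None, pyaudio.paContinue)

pa = pyaudio.PyAudio()
callback_stream = pa.open(
    rate=16000,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=1024,
    stream_callback=_fill_queue,
)
# The stream starts immediately (start=True is the default), so the callback
# keeps filling audio_queue in the background from this point on.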
How should I properly handle this stream? If there are fundamental things I should be understanding better, I'd also really appreciate pointers to the right resources.