How to decode AWS Kinesis Video Stream GetMedia API output to mp3/wav?

Question

I ingested audio into a Kinesis Video Stream (KVS) via the AWS Connect service. Using the GetMedia API I am able to extract the Payload, but how can I convert this output to mp3/wav? I want to feed it to the AWS Transcribe service to get a text transcript of the call that AWS Connect ingested into KVS.

The Payload output of the code below looks like this:

00#AWS_KINESISVIDEO_CONTINUATION_TOKEND\x87....\x1faudio/L16;rate=8000;channels=1;\x12T\xc......00"AWS_KINESISVIDEO_MILLIS_BEHIND_NOWD\x87\x10\x00\x00\x074564302g\xc8\x10\x00\x00^E\xa3\x10\x00\x00#AWS_KINESISVIDEO_CONTINUATION_TOKEND\x87\x10\x00\x00/91343852333181432506572546233025969374566791063'

Note: The response was too long, so only part of it is pasted above.

import json
import boto3

kinesis_client = boto3.client('kinesisvideo', region_name='us-east-1')

response = kinesis_client.get_data_endpoint(
    StreamARN='arn:aws:kinesisvideo:us-east-1:47...',
    APIName='GET_MEDIA')

t = response['DataEndpoint']
video_client = boto3.client('kinesis-video-media', endpoint_url=t, region_name='us-east-1')
stream = video_client.get_media(
    StreamARN='arn:aws:kinesisvideo:us-east-1:47...',
    StartSelector={'StartSelectorType': 'EARLIEST'})

streamingBody = stream['Payload']
print(streamingBody.read())
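
Because GetMedia returns a streaming body, a single `read()` with no argument tries to pull the whole stream into memory and may block for a long time. A minimal sketch of reading in chunks instead is shown here; `copy_stream` and its parameters are hypothetical names, and the only thing assumed about the body is that it supports `read(n)`, which boto3's streaming body does.

```python
def copy_stream(body, out, chunk_size=64 * 1024, max_bytes=None):
    """Copy a file-like `body` into `out` chunk by chunk.

    Stops at end of stream, or once `max_bytes` have been copied
    (useful because a live stream may keep producing data).
    Returns the number of bytes copied.
    """
    copied = 0
    while max_bytes is None or copied < max_bytes:
        # Never ask for more than the remaining budget.
        want = chunk_size if max_bytes is None else min(chunk_size, max_bytes - copied)
        chunk = body.read(want)
        if not chunk:  # empty bytes means the stream has ended
            break
        out.write(chunk)
        copied += len(chunk)
    return copied
```

With the code above this could be called as `copy_stream(stream['Payload'], f)` on a file opened in `'wb'` mode, which saves the raw container bytes for later inspection.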

Please suggest how I can convert the payload output to mp3/wav.

How did you solve this problem? I have a quite similar problem: I need to extract the first frame of video from the Payload. – py_ml Jul 02 '19 at 06:00
@py_ml my team followed and deployed this: github.com/aws-samples/amazon-connect-realtime-transcription – sudhir tataraju Jul 02 '19 at 16:01

score 1 · Answer 1 · answered Apr 01 '19 at 11:00

I am facing the same problem. I can export the payload to S3 as a raw file, but when I listen to it, it is not properly audible; it sounds like an encrypted conversation.

I just save the payload to a file:

with open("myAudio.wav", "wb") as f:
    f.write(stream['Payload'].read())
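
One possible explanation for the garbled playback, offered as an assumption rather than a confirmed diagnosis: the bytes written here are the Matroska (MKV) container returned by GetMedia, not WAV data, so giving the file a `.wav` name does not make it playable. Once the raw audio samples have been pulled out of the container, Python's standard `wave` module can add a proper header. The sketch below assumes 16-bit mono PCM at 8000 Hz, based on the `audio/L16;rate=8000;channels=1` string visible in the question's payload. `pcm_to_wav` is a hypothetical helper name, and little-endian sample order is also an assumption.

```python
import wave


def pcm_to_wav(pcm_bytes, out, sample_rate=8000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV header.

    `out` may be a file path or a seekable binary file object.
    Defaults are assumptions: 8000 Hz, mono, 16-bit (2 bytes) samples.
    """
    with wave.open(out, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        # The header is finalized when the writer is closed.
        wav.writeframes(pcm_bytes)
```

This only adds a header; it does not parse MKV, so passing the unmodified GetMedia payload to it would still produce noise.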

marxan


Can you convert that audio to text? Try the code below and see whether the audio is converted to text properly:

import speech_recognition as sr

r = sr.Recognizer()
audio = 'myAudio.wav'
with sr.AudioFile(audio) as source:
    print('Started!')
    audio = r.record(source)
    print('Done!')
try:
    text = r.recognize_google(audio)
    print(text)
except Exception as e:
    print(e)

– sudhir tataraju Apr 01 '19 at 16:11
By the way, it is a waste of effort to try: we had a chat with the AWS technical team, and they clearly told us that, as of now, the MKV-formatted Kinesis media can only be parsed using Java, not Python. So follow the link below step by step to deploy the AWS Connect transcription solution. There is no need to know Java, just follow the steps as they are: https://github.com/aws-samples/amazon-connect-realtime-transcription Our team succeeded in doing this without Java knowledge; I hope you will be able to as well. – sudhir tataraju Apr 01 '19 at 17:02
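
For readers who still want to experiment in Python: the MKV container is built on the documented EBML element layout, so a small hand-written reader can walk it. The following is only a minimal sketch, not the AWS Java parser and not tested against a real Connect stream. It walks the elements, descends into Segment, Cluster and BlockGroup, and concatenates the frame data of every SimpleBlock or Block on one track. The function name `extract_track_frames`, the default track number, and the absence of lacing are all assumptions.

```python
import io

# Element IDs from the Matroska specification.
MASTER_IDS = {0x18538067, 0x1F43B675, 0xA0}  # Segment, Cluster, BlockGroup
BLOCK_IDS = {0xA3, 0xA1}                     # SimpleBlock, Block
UNKNOWN_SIZE = -1


def _read_vint(buf, keep_marker):
    """Read one EBML variable-length integer; return None at end of data."""
    first = buf.read(1)
    if not first:
        return None
    length = 9 - first[0].bit_length()  # leading zero bits + 1
    if length > 8:
        raise ValueError("invalid EBML variable-length integer")
    rest = buf.read(length - 1)
    if len(rest) < length - 1:
        return None
    raw = int.from_bytes(first + rest, "big")
    if keep_marker:  # element IDs keep their length-marker bit
        return raw
    mask = (1 << (7 * length)) - 1
    value = raw & mask
    return UNKNOWN_SIZE if value == mask else value  # all ones: unknown size


def extract_track_frames(mkv_bytes, track_number=1):
    """Concatenate frame data of all blocks belonging to `track_number`."""
    buf = io.BytesIO(mkv_bytes)
    frames = bytearray()
    while True:
        element_id = _read_vint(buf, keep_marker=True)
        size = _read_vint(buf, keep_marker=False)
        if element_id is None or size is None:
            break
        if element_id in MASTER_IDS:
            continue  # children follow directly, so just keep reading
        if size == UNKNOWN_SIZE:
            raise ValueError("unknown size on an element this sketch cannot skip")
        payload = buf.read(size)
        if len(payload) < size:
            break  # truncated final element
        if element_id in BLOCK_IDS:
            head = io.BytesIO(payload)
            track = _read_vint(head, keep_marker=False)
            if track == track_number:
                # Skip 2-byte timecode and 1-byte flags (assumes no lacing).
                frames += payload[head.tell() + 3:]
    return bytes(frames)
```

If the frames do carry raw PCM, the returned bytes could then be wrapped in a WAV header with the standard `wave` module. Which track number holds the caller's audio is something to verify against the stream's track metadata.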

Hey, thanks for your answer. I haven't even tried to transcribe the audio yet. At the moment, I only want to save it in an S3 bucket and then listen to it as if it were a voicemail. But somehow the audio file is not properly audible. Did you manage to convert the payload into a listenable wav file? – marxan Apr 02 '19 at 06:35
Yes, my colleagues did it using the code in the GitHub link I provided above. What is your audio producer? I mean, where are you ingesting audio into Kinesis from? – sudhir tataraju Apr 02 '19 at 10:20

From AWS Connect. But the link you provided only explains how to transcribe the audio. I thought your concern was to output the payload in wav or mp3 format. Have you succeeded with this in Python? I don't want to transcribe it yet; I just want to save the payload from the getMedia function into a file that I can listen to with Audacity or QuickTime Player, for example. – marxan Apr 02 '19 at 11:03
Then you are looking for a non-realtime solution. In that case you don't need Kinesis at all; Kinesis is for realtime streaming. To save the audio to an S3 bucket, use the recording block instead of the Kinesis block in your contact flow in AWS Connect. The S3 bucket where the audio is saved is listed in your AWS Connect account settings. Refer to this link: https://stackoverflow.com/questions/48953359/voice-message-save-in-aws-s3-bucket-using-amazon-connect – sudhir tataraju Apr 03 '19 at 04:30
Well, not really: the agent must be listening in order for the call to be recorded. In my case, we want to create a voicemail, which means no agent will take the call. – marxan Apr 03 '19 at 06:47
In the same recording block, you get an option to record only the customer's voice or both the agent's and the customer's. And yes, the agent can even be a computer (IVR), i.e. a played prompt, for which a prompt block is available in AWS Connect. – sudhir tataraju Apr 03 '19 at 07:53

@marxan hi, did you find an answer to this? – James_RajKumar Feb 06 '20 at 11:34
Hi James_RajKumar, not in Python, unfortunately. We had to do it in Java instead; AWS has a solution for it. – marxan Feb 07 '20 at 16:13