I'm trying to write a Python script for processing audio data stored on S3.
I have an S3 object which I'm fetching with:
def grabAudio(filename, directory):
    obj = s3client.get_object(Bucket=bucketname, Key=directory+'/'+filename)
    return obj['Body'].read()
Accessing the data using
print(obj['Body'].read())
yields the correct audio information, so it's accessing the data from the bucket just fine.
When I try to then use this data in my audio processing library (pydub), it fails:
audio = AudioSegment.from_wav(grabAudio(filename, bucketname))
Traceback (most recent call last):
  File "split_audio.py", line 38, in <module>
    audio = AudioSegment.from_wav(grabAudio(filename, bucketname))
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 544, in from_wav
    return cls.from_file(file, 'wav', parameters)
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 456, in from_file
    file.seek(0)
AttributeError: 'bytes' object has no attribute 'seek'
What is the type of the object coming back from S3? A byte array, I presume? If so, is there a way to parse it as a .wav without saving it to disk? I'd like to avoid writing to disk.
Also open to alternative audio processing libraries.
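For reference, the traceback suggests pydub wants a seekable file-like object rather than raw bytes, so one thing that might work is wrapping the bytes in an in-memory buffer. Below is a minimal sketch of that idea using only the standard library: `make_wav_bytes` and `as_file_like` are hypothetical helpers, the generated silent WAV stands in for `obj['Body'].read()`, and the stdlib `wave` module stands in for pydub so the snippet runs without S3 access.

```python
import io
import wave


def make_wav_bytes(n_frames=800, framerate=8000):
    """Build a small silent mono 16-bit WAV in memory (stand-in for the S3 body)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def as_file_like(data):
    """Wrap raw bytes in a seekable in-memory file object; nothing touches disk."""
    return io.BytesIO(data)


data = make_wav_bytes()       # plays the role of obj['Body'].read()
stream = as_file_like(data)
stream.seek(0)                # the call that fails on a plain bytes object

# Parse the WAV header straight from memory.
with wave.open(stream, "rb") as w:
    n_frames = w.getnframes()
    framerate = w.getframerate()

# Assumption: with pydub, the same wrapper would replace the raw bytes, e.g.
# audio = AudioSegment.from_wav(io.BytesIO(grabAudio(filename, directory)))
```

If pydub accepts file handles the way the `file.seek(0)` line in the traceback implies, passing the `io.BytesIO` wrapper instead of the bytes should avoid the `AttributeError` while keeping everything in memory.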