We may write the WAV data to FFmpeg stdin pipe, and read the encoded OGG data from FFmpeg stdout pipe.
My following answer describes how to do it with video, and we may apply the same solution to audio.
Piping architecture:
 --------------------  Encoded      ---------  Encoded      ------------
| Input WAV encoded  | WAV data    | FFmpeg  | OGG data    | Store to   |
| stream             | ----------> | process | ----------> | BytesIO    |
 --------------------  stdin PIPE   ---------  stdout PIPE  ------------
The implementation is equivalent to the following shell command:
cat input.wav | ffmpeg -y -f wav -i pipe: -acodec libopus -f ogg pipe: > test.ogg
According to Wikipedia, common audio codecs for the OGG format are Vorbis, Opus, FLAC, and OggPCM (I selected the Opus audio codec).
The example uses the ffmpeg-python module, but it is just a binding to an FFmpeg sub-process (the FFmpeg CLI must be installed, and must be in the execution path).
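Since ffmpeg-python only assembles the command line and starts the sub-process, the same pipeline can presumably be sketched with the standard subprocess module alone. The helper names build_ffmpeg_command and start_ffmpeg below are hypothetical (they are not part of any library); the arguments mirror the shell command above, and FFmpeg is still assumed to be in the execution path.

```python
import subprocess


def build_ffmpeg_command(in_format='wav', out_format='ogg', codec='libopus'):
    """Return the FFmpeg argument list that mirrors the shell command."""
    return [
        'ffmpeg', '-y',
        '-f', in_format, '-i', 'pipe:',  # Read the input from stdin pipe.
        '-acodec', codec,                # Selected audio encoder.
        '-f', out_format, 'pipe:',       # Write the output to stdout pipe.
    ]


def start_ffmpeg(cmd):
    """Start the sub-process with stdin and stdout connected to pipes.

    This function is only defined here, it is not called by the sketch.
    """
    return subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)


print(build_ffmpeg_command())
```

The object returned by start_ffmpeg exposes the same stdin and stdout pipes that the rest of the answer uses, so the writer thread and the reading loop should apply unchanged.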
Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output:
ffmpeg_process = (
    ffmpeg
    .input('pipe:', format='wav')
    .output('pipe:', format='ogg', acodec='libopus')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)
The input format is set to wav, the output format is set to ogg, and the selected encoder is libopus.
Assuming the audio file is relatively large, we can't write the entire WAV data at once, because doing so (without "draining" the stdout pipe) causes a deadlock: FFmpeg blocks while writing to a full stdout pipe, stops reading from stdin, and our write never completes.
We may have to write the WAV data (in chunks) in a separate thread, and read the encoded data in the main thread.
Here is a sample for the "writer" thread:
def writer(ffmpeg_proc, wav_bytes_arr):
    chunk_size = 1024  # Define chunk size to 1024 bytes (the exact size is not important).
    n_chunks = len(wav_bytes_arr) // chunk_size  # Number of chunks (without the remainder smaller chunk at the end).
    remainder_size = len(wav_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

    for i in range(n_chunks):
        ffmpeg_proc.stdin.write(wav_bytes_arr[i*chunk_size:(i+1)*chunk_size])  # Write chunk of data bytes to stdin pipe of FFmpeg sub-process.

    if remainder_size > 0:
        ffmpeg_proc.stdin.write(wav_bytes_arr[chunk_size*n_chunks:])  # Write remainder bytes of data bytes to stdin pipe of FFmpeg sub-process.

    ffmpeg_proc.stdin.close()  # Close stdin pipe - closing stdin finishes encoding the data, and closes FFmpeg sub-process.
The "writer thread" writes the WAV data in small chunks.
The last chunk is smaller (assume the length is not a multiple of the chunk size).
At the end, stdin pipe is closed.
Closing stdin finishes encoding the data, and closes FFmpeg sub-process.
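The chunking itself can also be sketched with a step argument in range(), which produces the shorter final chunk without a separate remainder branch. The helper name iter_chunks is hypothetical and only illustrates the slicing; it does not touch FFmpeg.

```python
def iter_chunks(data, chunk_size=1024):
    """Yield consecutive slices of data; the last slice may be shorter."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Example with a small byte string and a chunk size of 4 bytes.
chunks = list(iter_chunks(b'0123456789', chunk_size=4))
print(chunks)  # [b'0123', b'4567', b'89']
```

Inside the writer, each chunk yielded by iter_chunks(wav_bytes_arr) would be passed to ffmpeg_proc.stdin.write, followed by ffmpeg_proc.stdin.close() after the loop.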
In the main thread, we start the writer thread, and read the encoded OGG data from stdout pipe (in chunks):
thread = threading.Thread(target=writer, args=(ffmpeg_process, wav_bytes_array))
thread.start()

while thread.is_alive():
    ogg_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
    out_stream.write(ogg_chunk)  # Write the encoded chunk to the "in-memory file".
For reading the remaining data, we may use ffmpeg_process.communicate():
# Read the last encoded chunk.
ogg_chunk = ffmpeg_process.communicate()[0]
out_stream.write(ogg_chunk) # Write the encoded chunk to the "in-memory file".
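A possible variation of the reading side is to keep reading until read() returns an empty bytes object, which marks the end of the stdout pipe, instead of polling thread.is_alive() and calling communicate(). The sketch below tries the pattern without FFmpeg: the child process is a stand-in (an assumption made for testing) that simply copies stdin to stdout, and the names feed and pipe_through are hypothetical.

```python
import subprocess
import sys
import threading
from io import BytesIO


def feed(proc, data, chunk_size=1024):
    """Writer thread body: write data to the child's stdin in chunks, then close it."""
    for start in range(0, len(data), chunk_size):
        proc.stdin.write(data[start:start + chunk_size])
    proc.stdin.close()  # Signals end of input to the child process.


def pipe_through(cmd, data, chunk_size=1024):
    """Send data through the child process and return all bytes it writes to stdout."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    thread = threading.Thread(target=feed, args=(proc, data, chunk_size))
    thread.start()

    out_stream = BytesIO()
    while True:
        chunk = proc.stdout.read(chunk_size)
        if not chunk:  # Empty bytes: the child closed stdout (end of output).
            break
        out_stream.write(chunk)

    thread.join()
    proc.stdout.close()
    proc.wait()
    return out_stream.getvalue()


# Stand-in child (not FFmpeg): copies stdin to stdout unchanged.
echo_cmd = [sys.executable, '-c',
            'import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)']
payload = bytes(range(256)) * 4096  # 1 MiB of test data.
result = pipe_through(echo_cmd, payload)
print(result == payload)  # True
```

With FFmpeg, cmd would be the FFmpeg command line and data the WAV bytes; the sketch does not handle the case where the child exits early and the writer gets a broken pipe.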
Complete code sample:
import ffmpeg
import base64
from io import BytesIO
import threading
# Equivalent shell command
# cat input.wav | ffmpeg -y -f wav -i pipe: -acodec libopus -f ogg pipe: > test.ogg
# Writer thread - write the wav data to FFmpeg stdin pipe in small chunks of 1KBytes.
def writer(ffmpeg_proc, wav_bytes_arr):
    chunk_size = 1024  # Define chunk size to 1024 bytes (the exact size is not important).
    n_chunks = len(wav_bytes_arr) // chunk_size  # Number of chunks (without the remainder smaller chunk at the end).
    remainder_size = len(wav_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

    for i in range(n_chunks):
        ffmpeg_proc.stdin.write(wav_bytes_arr[i*chunk_size:(i+1)*chunk_size])  # Write chunk of data bytes to stdin pipe of FFmpeg sub-process.

    if remainder_size > 0:
        ffmpeg_proc.stdin.write(wav_bytes_arr[chunk_size*n_chunks:])  # Write remainder bytes of data bytes to stdin pipe of FFmpeg sub-process.

    ffmpeg_proc.stdin.close()  # Close stdin pipe - closing stdin finishes encoding the data, and closes FFmpeg sub-process.
# The example reads the decode_string from a file, assume: decoded_bytes_array = base64.b64decode(audioString)
with open('input.wav', 'rb') as f:
    wav_bytes_array = f.read()
# Encode as base64 and decode the base64 - assume the encoded and decoded data are bytes arrays (not UTF-8 strings).
dat = base64.b64encode(wav_bytes_array) # Encode as Base64 (used for testing - not part of the solution).
wav_bytes_array = base64.b64decode(dat) # wav_bytes_array corresponds to "decode_string" (from the question).
# Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output.
ffmpeg_process = (
    ffmpeg
    .input('pipe:', format='wav')
    .output('pipe:', format='ogg', acodec='libopus')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)
# Open in-memory file for storing the encoded OGG file
out_stream = BytesIO()
# Starting a thread that writes the WAV data in small chunks.
# We need the thread because writing too much data to stdin pipe at once, causes a deadlock.
thread = threading.Thread(target=writer, args=(ffmpeg_process, wav_bytes_array))
thread.start()
# Read encoded OGG data from stdout pipe of FFmpeg, and write it to out_stream
while thread.is_alive():
    ogg_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
    out_stream.write(ogg_chunk)  # Write the encoded chunk to the "in-memory file".
# Read the last encoded chunk.
ogg_chunk = ffmpeg_process.communicate()[0]
out_stream.write(ogg_chunk) # Write the encoded chunk to the "in-memory file".
out_stream.seek(0) # Seek to the beginning of out_stream
ffmpeg_process.wait() # Wait for FFmpeg sub-process to end
# Write out_stream to file - just for testing:
with open('test.ogg', "wb") as f:
    f.write(out_stream.getbuffer())
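A quick sanity check of the in-memory result, without writing a file, is to look at the first four bytes: every Ogg page starts with the capture pattern "OggS". The helper name looks_like_ogg is hypothetical, and the sample bytes below are synthetic placeholders rather than real FFmpeg output.

```python
def looks_like_ogg(data):
    """Return True if data starts with the Ogg page capture pattern b'OggS'."""
    return bytes(data[:4]) == b'OggS'


# Synthetic samples (not real audio): only the leading bytes matter for the check.
fake_ogg = b'OggS' + b'\x00' * 23
fake_wav = b'RIFF' + b'\x00' * 4 + b'WAVE'

print(looks_like_ogg(fake_ogg))  # True
print(looks_like_ogg(fake_wav))  # False
```

In the complete sample, the check could be applied as looks_like_ogg(out_stream.getvalue()); it only inspects the header and does not validate the rest of the stream.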