I am using cv2 to edit images and create a video from the frames with FFMPEG. See this post for more details.
The images are 3D RGB NumPy arrays (shape is like [h, w, 3]), and they are stored in a Python list.
Yep, I know cv2 has a VideoWriter and I have used it before, but it is very inadequate for my needs. Simply put, it can only use the FFMPEG version that comes bundled with it; that version does not support CUDA, it uses up all the CPU time when generating the videos while not using any GPU time at all, the output is way too big, and I can't pass many FFMPEG parameters to the VideoWriter initialization.
I downloaded precompiled binaries of FFMPEG for Windows with CUDA support here; I am using Windows 10 21H1 x64, and my GPU is an NVIDIA GeForce GTX 1050 Ti.
Anyway, I need to mess with all the parameters found here and there to find the best compromise between quality and compression, like this:
command = '{} -y -stream_loop {} -framerate {} -hwaccel cuda -hwaccel_output_format cuda -i {}/{}_%d.png -c:v hevc_nvenc -preset 18 -tune 1 -rc vbr -cq {} -multipass 2 -b:v {} -vf scale={}:{} {}'
os.system(command.format(FFMPEG, loops-1, fps, tmp_folder, file_name, quality, bitrate, frame_width, frame_height, outfile))
I need to use exactly the binary I downloaded and specify as many parameters as I can to achieve the optimal result.
Currently I can only save the arrays to disk as images and use those images as the input of FFMPEG; that is slow, but I need exactly that binary and all those parameters.
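For reference, this is roughly what the current disk-based step looks like (a minimal sketch of my own; frames, tmp_folder and file_name are placeholder names matching the command above, not code from the linked post):

import os
import cv2

# frames is the Python list of [h, w, 3] RGB NumPy arrays
for i, frame in enumerate(frames):
    # cv2.imwrite expects BGR order, so the RGB frames are converted first
    cv2.imwrite(os.path.join(tmp_folder, '{}_{}.png'.format(file_name, i)),
                cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))

# the FFMPEG command shown above is then run with os.system(...)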
After hours of Google searching I found ffmpeg-python, which seems perfect for the job, and I even found this: I can pass the binary path as an argument to the run function. Here is the code:
import ffmpeg
import io
import numpy as np

def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    _, height, width, channels = images.shape
    process = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
        .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
        .overwrite_output()
        .run_async(pipe_stdin=True, overwrite_output=True, pipe_stderr=True)
    )
    for frame in images:
        try:
            process.stdin.write(
                frame.astype(np.uint8).tobytes()
            )
        except Exception as e:  # should probably be a narrower exception related to process.stdin.write
            for line in io.TextIOWrapper(process.stderr, encoding="utf-8"):  # I didn't know how to get the stderr from the process, but this worked for me
                print(line)  # print all the lines in the process's stderr after it has errored
            process.stdin.close()
            process.wait()
            return  # can't write anymore, so end the for loop and the function execution
    # close stdin and wait for FFMPEG to finish writing the file
    process.stdin.close()
    process.wait()
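For what it's worth, calling it on my list of frames would look something like this (my own example, with a made-up output path):

# frames is the Python list of [h, w, 3] RGB NumPy arrays described above
vidwrite('output.mp4', frames, framerate=60, vcodec='libx264')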
However, I need to pass all those parameters, and possibly many more, to the process, and I am not sure where they should go (where should stream_loop go? What about hwaccel, hwaccel_output_format, multipass...?).
How do I properly pipe a bunch of NumPy arrays to an FFMPEG process spawned from a binary that supports CUDA, and pass all sorts of arguments to the initialization of that process?
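My best guess so far is something like the sketch below, but it is untested and based purely on my assumptions about ffmpeg-python: that keyword arguments to input() become input options (so hwaccel, hwaccel_output_format and stream_loop would go there), that keyword arguments to output() become output options (cq, multipass, b:v, vf, ...), and that the path to my downloaded binary goes in the cmd argument of run_async (FFMPEG, quality, bitrate, frame_width and frame_height are the same placeholder variables as in the command above):

import ffmpeg
import numpy as np

FFMPEG = 'C:/path/to/ffmpeg.exe'  # placeholder path to the CUDA-enabled binary I downloaded

def vidwrite_nvenc(fn, images, framerate=60, frame_width=1920, frame_height=1080,
                   quality=30, bitrate='5M'):
    images = np.asarray(images)
    _, height, width, _ = images.shape
    process = (
        ffmpeg
        # my assumption: keyword arguments here become input options (placed before -i)
        # stream_loop would presumably also go here, though I'm not sure it even makes
        # sense with a piped rawvideo input
        .input('pipe:', format='rawvideo', pix_fmt='rgb24',
               s='{}x{}'.format(width, height), r=framerate,
               hwaccel='cuda', hwaccel_output_format='cuda')
        # my assumption: keyword arguments here become output options;
        # options with a colon in the name (like b:v) need the **{} form
        .output(fn, vcodec='hevc_nvenc', pix_fmt='yuv420p', r=framerate,
                preset=18, tune=1, rc='vbr', cq=quality, multipass=2,
                vf='scale={}:{}'.format(frame_width, frame_height),
                **{'b:v': bitrate})
        .overwrite_output()
        # my assumption: cmd= makes the process use my downloaded binary instead of
        # whatever ffmpeg is on PATH
        .run_async(cmd=FFMPEG, pipe_stdin=True, pipe_stderr=True)
    )
    for frame in images:
        process.stdin.write(frame.astype(np.uint8).tobytes())
    process.stdin.close()
    process.wait()

If that is the right place for these options I would still like confirmation, especially for stream_loop and the hwaccel options, since I don't know how they interact with a piped input.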