
I use the following code to compress a video and extract its frames. Note that I do not want to save the resulting video.

        output_args = {
            "vcodec": "libx265",
            "crf": 24,
        }
        out, err = (
            ffmpeg
            .input(in_filename)
            .output('pipe:', format='rawvideo', pix_fmt='rgb24', **output_args)
            .run(capture_stdout=True)
        )
        frames = np.frombuffer(out, np.uint8).reshape(-1, width, height, 3)

When I try to reshape the output buffer to the original video dimensions, I get the following error: "cannot reshape array of size 436567 into shape (1920,1080,3)". This is expected, because the resulting video has smaller dimensions. Is there a way to calculate the number of frames, width, and height of the compressed video, in order to reshape the frames from the buffer?

Also, if I save the compressed video instead of loading its frames, and then load the frames from that compressed file, they have the same dimensions as the original. I suspect that there is some sort of interpolation happening under the hood. Is there a way to apply it without saving the video?

xro7

1 Answer


I found a solution using ffmpeg-python.

Assumptions:

  • out holds the entire H.265-encoded stream in a memory buffer.
  • You don't want to write the stream into a file.
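The first assumption is easy to sanity-check with the numbers from the question: a single raw RGB24 frame at 1920×1080 already needs 6,220,800 bytes, far more than the 436,567 bytes in the buffer, so out cannot be raw video. A quick arithmetic sketch:

```python
# The buffer from the question is far too small to hold even one raw frame,
# so it must contain the compressed (H.265) bitstream, not raw pixels.
width, height = 1920, 1080           # original video dimensions from the question
raw_frame_size = width * height * 3  # bytes in one raw RGB24 frame
buffer_size = 436567                 # buffer size reported in the error message

print(raw_frame_size)                # 6220800
print(buffer_size < raw_frame_size)  # True
```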

The solution does the following:

  • Execute FFmpeg in a sub-process with stdin as input pipe and stdout as output pipe.
    The input is going to be the video stream (memory buffer).
    The output format is raw video frames in BGR pixel format.
  • Write the stream content to the pipe (to stdin).
  • Read the decoded video (frame by frame), and display each frame (using cv2.imshow).

For testing the solution, I created a sample video file and read it into a memory buffer (encoded as H.265).
I used the memory buffer as input to the above code (your out buffer).

Here is the complete code, including the testing code:

import ffmpeg
import numpy as np
import cv2
import io

in_filename = 'in.mp4'

# Build synthetic video, for testing begins:
###############################################
# ffmpeg -y -r 10 -f lavfi -i testsrc=size=192x108:rate=1 -c:v libx265 -crf 24 -t 5 in.mp4

width, height = 192, 108

(
    ffmpeg
    .input('testsrc=size={}x{}:rate=1'.format(width, height), r=10, f='lavfi')
    .output(in_filename, vcodec='libx265', crf=24, t=5)
    .overwrite_output()
    .run()
)
###############################################


# Use ffprobe to get video frames resolution
###############################################
p = ffmpeg.probe(in_filename, select_streams='v')
width = p['streams'][0]['width']
height = p['streams'][0]['height']
n_frames = int(p['streams'][0]['nb_frames'])
###############################################


# Stream the entire video as one large array of bytes
###############################################
# https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
in_bytes, _ = (
    ffmpeg
    .input(in_filename)
    .video # Video only (no audio).
    .output('pipe:', format='hevc', crf=24)
    .run(capture_stdout=True) # Run synchronously, and capture stdout
)
###############################################


# Open an in-memory binary stream
stream = io.BytesIO(in_bytes)

# Execute FFmpeg in a subprocess with stdin as input pipe and stdout as output pipe
# The input is going to be the video stream (memory buffer)
# The output format is raw video frames in BGR pixel format.
# https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
# https://github.com/kkroening/ffmpeg-python/issues/156
# http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/
process = (
    ffmpeg
    .input('pipe:', format='hevc')
    .video
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)


# https://stackoverflow.com/questions/20321116/can-i-pipe-a-io-bytesio-stream-to-subprocess-popen-in-python
# https://gist.github.com/waylan/2353749
process.stdin.write(stream.getvalue())  # Write stream content to the pipe
process.stdin.close()  # close stdin (flush and send EOF)


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break

    # transform the byte read into a numpy array
    in_frame = (
        np
        .frombuffer(in_bytes, np.uint8)
        .reshape([height, width, 3])
    )

    # Display the frame
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

process.wait()
cv2.destroyAllWindows()

Note: I used stdin and stdout instead of named pipes because I wanted the code to work on both Windows and Linux.
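To return the frames as a single array (as asked in the question) instead of displaying them, the read loop above can collect the decoded bytes and reshape them once; each raw BGR24 frame occupies width * height * 3 bytes. A minimal sketch of just the reshape step, using a synthetic buffer in place of the bytes read from process.stdout (bytes_to_frames is a hypothetical helper, not part of ffmpeg-python):

```python
import numpy as np

def bytes_to_frames(raw, width, height):
    """Reshape a raw BGR24 byte stream into an (n_frames, height, width, 3) array."""
    frame_size = width * height * 3       # bytes per BGR24 frame
    n_frames = len(raw) // frame_size     # whole frames present in the buffer
    usable = raw[:n_frames * frame_size]  # drop any trailing partial frame
    return np.frombuffer(usable, np.uint8).reshape(n_frames, height, width, 3)

# Synthetic stand-in for the raw bytes decoded by the FFmpeg subprocess:
w, h, n = 192, 108, 5
raw = bytes(w * h * 3 * n)  # n all-zero frames
frames = bytes_to_frames(raw, w, h)
print(frames.shape)  # (5, 108, 192, 3)
```

Note that the frame dimensions are ordered (height, width, 3), matching the per-frame reshape in the loop above.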

Rotem
  • Thanks for the answer. However, the reason I posted the question is that I want to compress the video first and then load the frames on RAM. So I need to use compression parameters like `vcodec` and `crf`. Are you saying that I cannot get raw RGB with these parameters? – xro7 Jan 29 '20 at 23:35
  • I don't know if it's possible. In case it's possible, it's complicated: you need to build a filter graph with two output streams: a compressed stream that is saved to disk, and an uncompressed one that goes to the pipe. It's much simpler to use two `FFmpeg` commands. – Rotem Jan 29 '20 at 23:40
  • In case I misunderstand you, and you want compressed sequence in RAM, you **can't** reshape it into frames, because you need to decode it first (for decoding, you can use FFmpeg [complicated], or use OpenCV capture). I don't know if it's going to work with all the (compressed) video stream in the RAM. – Rotem Jan 29 '20 at 23:46
  • Yes, I want the compressed sequence in RAM. If I save the compressed video to disk first, then I can extract the compressed frames into RAM using the original video's shape. But my constraint is that I don't want to use the disk. I will maybe search for a way to decode it first, as you suggested. – xro7 Jan 29 '20 at 23:51
  • I posted a new example that works with a compressed sequence in RAM. The code sample does not address your question specifically. With minor modifications, it should solve your problem. – Rotem Feb 01 '20 at 09:29
  • Thanks for the effort. I will check this when I find the time and come back to you. Much appreciated. – xro7 Feb 01 '20 at 10:22