
I'm about to generate 2D and 3D music animations and render them to video using C++. I was thinking about using OpenGL, but I've read that, unfortunately, it is being discontinued in favour of Vulkan, which seems to offer higher GPU performance but is also a lower-level API, making it more difficult to learn. I still have almost no knowledge of either OpenGL or Vulkan; I'm only beginning to learn them now.

My question is:

Is there a way to encode the Vulkan render output (whether or not a window is shown) into a video file, preferably through FFmpeg? If so, how could I do that?

Requirements:

  • Speed: the performance cost should be close to that of the video encoding alone, not much more (as would happen, e.g., by saving lossless frames as images first and then encoding a video from them).
  • Controllable FPS and resolution: the video fps and frame resolution can be freely chosen.
  • Reliability, reproducibility: running code that produces the same Vulkan output twice should result in two identical videos, independently of the system, i.e. no dropped frames, no async problems (I want to sync with audio) or the like. The chosen video FPS should stay fixed (e.g. 60 fps), no matter whether the computer can render at 300 or 3 fps.

What I found out so far:

  • An example of taking "screenshots" of Vulkan output: it writes a .ppm image at the end, which is an uncompressed binary image file.
  • An encoder for rendering videos from OpenGL output, which is what I want, except that it uses OpenGL.
  • That Khronos includes a video subset in the Vulkan API.
  • A video tool to decode, demux, and process videos using FFMPEG and Vulkan.
  • That it is possible to render the output into a buffer without needing a screen to display it.
  • There's libavformat/libavcodec, which are the libraries used by ffmpeg – user253751 Apr 26 '22 at 17:21
  • Note that in my understanding, ffmpeg is not a file format, but a project that *encompasses software implementations of video and audio compressing and decompressing algorithms*, according to [wikipedia](https://en.wikipedia.org/wiki/FFmpeg). – Damien Apr 26 '22 at 17:32
  • I'm not clear on what you're asking here. The examples you've found seem to cover all the individual components of what you need. You need to write the code to create the actual scene you want to render but aside from that, all you need to do is capture each frame and send it to either the ffmpeg library or the VK_KHR_video_encode_queue extension. Both will encode video, and ffmpeg is likely to be more widespread, but the extension is likely to be faster and executed on the GPU. – Jherico Apr 26 '22 at 18:42
  • @Damien Who ever said that ffmpeg is a file format? – Luiz Ventura May 11 '22 at 17:41
  • @Jherico but how to capture each frame? – Luiz Ventura May 11 '22 at 17:42
  • Capturing the frame from the framebuffer is what is happening in the `saveScreenshot` method of the screenshots example you linked to. That code shows how to move an image from GPU-managed memory into system memory, at which point you're free to push it into whatever encoding mechanism you want. – Jherico May 11 '22 at 19:23

1 Answer

First of all, ffmpeg is a framework used for video encoding and decoding. Second, if you have no experience with any GPU rendering API, you should start with OpenGL. Vulkan is very low-level and complicated; OpenGL will be around for a very long time and will not be replaced by Vulkan any time soon.

The off-screen rendering option you mentioned is probably the best one, though it doesn't really matter much: you can also use the image from the framebuffer. The image is just a matrix of RGBA pixels, and that data is the input for the video encoding. Take a look at how ffmpeg works: you send the rendered frame data to an encoder, which produces video packets that are stored in a video file. You need to choose a container (mp4, mkv, avi, ...) and a video codec (H.265, AV1, VP9, ...). You can of course implement a frame limiter and render the scene at a constant framerate, or just pick frames at a constant timestep.
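
To make that concrete, here is a minimal sketch of feeding raw RGBA frames into libavcodec (ffmpeg's encoding library). It assumes the frames are already downloaded from the framebuffer; error handling and the libavformat muxing are omitted, and the libx264 / fixed-fps settings are just example choices, not the only way to do it.

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>
}

// Open an H.264 encoder with a fixed timestep: one tick per frame,
// so the output fps never depends on how fast the scene renders.
AVCodecContext* open_encoder(int w, int h, int fps) {
    const AVCodec* codec = avcodec_find_encoder_by_name("libx264");
    AVCodecContext* ctx = avcodec_alloc_context3(codec);
    ctx->width = w;
    ctx->height = h;
    ctx->time_base = AVRational{1, fps};
    ctx->framerate = AVRational{fps, 1};
    ctx->pix_fmt = AV_PIX_FMT_YUV420P;  // x264 expects YUV, not RGBA
    avcodec_open2(ctx, codec, nullptr);
    return ctx;
}

// Convert one downloaded RGBA frame to YUV and push it into the encoder.
// 'sws' comes from sws_getContext(w, h, AV_PIX_FMT_RGBA, w, h,
// AV_PIX_FMT_YUV420P, SWS_BILINEAR, nullptr, nullptr, nullptr).
// 'pts' is simply the frame index.
void encode_rgba(AVCodecContext* ctx, SwsContext* sws,
                 const uint8_t* rgba, int64_t pts) {
    AVFrame* frame = av_frame_alloc();
    frame->format = ctx->pix_fmt;
    frame->width  = ctx->width;
    frame->height = ctx->height;
    av_frame_get_buffer(frame, 0);

    const uint8_t* src[1]  = { rgba };
    const int stride[1]    = { 4 * ctx->width };
    sws_scale(sws, src, stride, 0, ctx->height, frame->data, frame->linesize);
    frame->pts = pts;

    avcodec_send_frame(ctx, frame);
    AVPacket* pkt = av_packet_alloc();
    while (avcodec_receive_packet(ctx, pkt) == 0) {
        // Write pkt to the container (mp4, mkv, ...) via libavformat here.
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
    av_frame_free(&frame);
}
```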

The performance problem happens when you transfer data between RAM and GPU memory, for example when downloading the rendered image from the buffer and passing it to a CPU encoder. Therefore, the optimal approach would be Vulkan with the new video extension, sending the rendered frames directly to the HW-accelerated encoder without any transfers out of GPU memory. You can also run the encoder on a different thread so it works asynchronously.
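
For the threading part, a simple producer/consumer queue lets the render thread hand frames off to an encoder thread. This is only a sketch: `encode()` is a placeholder for whichever encoder you end up using (libavcodec as above, or the Vulkan video extension).

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

using Frame = std::vector<uint8_t>;  // one downloaded RGBA frame

std::queue<Frame> queue_;
std::mutex mtx_;
std::condition_variable cv_;
bool done_ = false;

// Runs on the worker thread: pop frames and encode them until the
// renderer signals it is finished and the queue has drained.
void encoder_thread() {
    for (;;) {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [] { return !queue_.empty() || done_; });
        if (queue_.empty() && done_) break;
        Frame frame = std::move(queue_.front());
        queue_.pop();
        lock.unlock();
        // encode(frame);  // placeholder for the actual encoder call
    }
}

// Called from the render thread after each framebuffer download.
void submit_frame(Frame frame) {
    {
        std::lock_guard<std::mutex> lock(mtx_);
        queue_.push(std::move(frame));
    }
    cv_.notify_one();
}

// Called once rendering is complete, so the worker can exit.
void finish() {
    {
        std::lock_guard<std::mutex> lock(mtx_);
        done_ = true;
    }
    cv_.notify_one();
}
```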

But honestly, it's not trivial. The simplest (non-realtime) solution for creating a video from a 3D render would be to:

  1. Create a fixed-FPS game loop
  2. Take screenshots of the scene by downloading the framebuffer data in OpenGL or Vulkan
  3. Feed the frames to the ffmpeg binary to create a video file (a sketch of this follows below)
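
For step 3, the frames don't even need to touch the disk: you can pipe the raw pixels straight into the ffmpeg binary's stdin. A minimal sketch, assuming POSIX `popen` (use `_popen` on Windows) and a hypothetical `render_frame()` that fills the RGBA buffer; the resolution, fps, and codec flags are example values.

```cpp
#include <cstdio>
#include <cstdint>
#include <vector>

int main() {
    const int w = 1920, h = 1080, frames = 600;
    // -f rawvideo: headerless pixels on stdin; -framerate fixes the
    // output fps regardless of how fast the scene actually renders.
    FILE* pipe = popen(
        "ffmpeg -y -f rawvideo -pix_fmt rgba -s 1920x1080 -framerate 60 "
        "-i - -c:v libx264 -pix_fmt yuv420p out.mp4", "w");

    std::vector<uint8_t> rgba(static_cast<size_t>(w) * h * 4);
    for (int i = 0; i < frames; ++i) {
        // render_frame(i, rgba.data());  // hypothetical: render the scene
        //                                // and download the framebuffer here
        fwrite(rgba.data(), 1, rgba.size(), pipe);
    }
    pclose(pipe);
    return 0;
}
```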

Another hack would be to use screen-recording software (OBS, Fraps, etc.) to create the video from your 3D app.

Hitokage