
I need to decode video, but my video player only supports the RGB8 pixel format. So I'm looking into how to do pixel format conversion on the GPU, preferably as part of the decoding process, but if that's not possible, after it.

I've found How to set decode pixel format in libavcodec?, which explains how to decode video with FFmpeg to a specific pixel format, as long as it's supported by the codec.

Basically, get_format() is a callback which chooses, from a list of pixel formats supported by the codec, the pixel format for the decoded video (a minimal sketch of such a callback follows the questions below). My questions are:

  1. Is this list of supported codec output formats the same on all computers? For example, if my codec is H264, will it always give me the same list on every computer (assuming the same FFmpeg version on all of them)?
  2. If I choose any of these supported pixel formats, will the pixel format conversion always happen on the GPU?
  3. If some of these pixel format conversions won't happen on the GPU, then my question is: does the sws_scale() function convert on the GPU or the CPU?
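
For reference, here is a minimal sketch of such a callback, assuming you want the decoder to output plain YUV420P whenever it is offered; the name pick_format is just illustrative and error handling is omitted:

    #include <libavcodec/avcodec.h>

    /* Called by the decoder with a list of pixel formats it can output for
     * this stream, terminated by AV_PIX_FMT_NONE. */
    static enum AVPixelFormat pick_format(AVCodecContext *ctx,
                                          const enum AVPixelFormat *fmts)
    {
        for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++) {
            if (*p == AV_PIX_FMT_YUV420P)
                return *p;   /* the codec can give us YUV420P directly */
        }
        return fmts[0];      /* otherwise fall back to the codec's first choice */
    }

    /* Installed before avcodec_open2():
     *     codec_ctx->get_format = pick_format;
     */
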
PPP

1 Answer

  1. It depends. First, H264 is just a codec standard; libx264 and openh264 are different implementations of that standard, so each implementation may support different formats. But let's assume (as you did in your question) that you are using the same implementation on different machines. Even then there can be cases where different machines support different formats. Take h264_amf, for example: you need an AMD graphics card to use that codec, and the supported formats also depend on which graphics card you have.
  2. Decoding will generally happen on your CPU unless you explicitly request a hardware decoder. See this example for hardware decoding: https://github.com/FFmpeg/FFmpeg/blob/release/4.1/doc/examples/hw_decode.c
    When using hardware decoding you rely heavily on your machine, and each hardware decoder outputs its own (often proprietary) frame format, e.g. NV12 on an NVIDIA graphics card. Now comes the tricky part: the decoded frames remain in GPU memory, which means you might be able to reuse the AVFrame buffer and do the pixel conversion with OpenCL/GL. But achieving GPU zero-copy when working across different frameworks is not easy, and I don't have enough knowledge to help you there. So what I would do is download the decoded frame from the GPU via av_hwframe_transfer_data, as in the example (see the condensed sketch after this list). From that point on it doesn't make much of a difference whether you used hardware or software decoding.
  3. To my knowledge sws_scale doesn't use hardware acceleration, since it doesn't accept "hwframes". If you want to do the color conversion on the GPU, you might want to take a look at OpenCV: you can use a GpuMat there, upload your frame, call cvtColor and download it again.
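
For item 2, here is a condensed sketch of the relevant parts of the linked hw_decode.c flow, assuming an NVIDIA GPU and the CUDA device type; the helper names are illustrative and error handling is omitted:

    #include <libavcodec/avcodec.h>
    #include <libavutil/hwcontext.h>

    static enum AVPixelFormat hw_pix_fmt;  /* e.g. AV_PIX_FMT_CUDA */

    /* Ask the decoder which hardware configurations it supports on this
     * machine; the list differs per decoder, FFmpeg build and GPU. */
    static void find_hw_format(const AVCodec *decoder)
    {
        for (int i = 0;; i++) {
            const AVCodecHWConfig *cfg = avcodec_get_hw_config(decoder, i);
            if (!cfg)
                break;  /* no (more) hardware configs available */
            if ((cfg->methods & AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX) &&
                cfg->device_type == AV_HWDEVICE_TYPE_CUDA)
                hw_pix_fmt = cfg->pix_fmt;
        }
    }

    /* After avcodec_receive_frame() the frame still lives in GPU memory
     * (frame->format == hw_pix_fmt). This copies it into a frame in system
     * memory, which typically comes out as NV12 on NVIDIA hardware. */
    static int download_frame(AVFrame *hw_frame, AVFrame *sw_frame)
    {
        return av_hwframe_transfer_data(sw_frame, hw_frame, 0);
    }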

Some general remarks:
Almost any image operation (scaling etc.) is faster on the GPU, but uploading and downloading the data can take ages, so for a single operation it's often not worth using the GPU.
In your case, I would try CPU decoding and CPU color conversion first. Just make sure to use well-threaded and vectorized libraries like OpenCV or Intel IPP. If you still lack performance, you can then think about hardware acceleration.
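
For illustration, here is a minimal libswscale sketch of the CPU-side color conversion, assuming the decoded frames are NV12 and the player wants packed RGB24 (nv12_to_rgb24 is an illustrative helper; error handling is omitted):

    #include <libavutil/frame.h>
    #include <libswscale/swscale.h>

    /* Convert one NV12 frame to RGB24 on the CPU. */
    static AVFrame *nv12_to_rgb24(const AVFrame *src)
    {
        AVFrame *dst = av_frame_alloc();
        dst->format = AV_PIX_FMT_RGB24;
        dst->width  = src->width;
        dst->height = src->height;
        av_frame_get_buffer(dst, 0);

        /* No scaling is done, so the flag choice hardly matters here. */
        struct SwsContext *sws = sws_getContext(
            src->width, src->height, AV_PIX_FMT_NV12,
            dst->width, dst->height, AV_PIX_FMT_RGB24,
            SWS_BILINEAR, NULL, NULL, NULL);

        sws_scale(sws, (const uint8_t * const *)src->data, src->linesize,
                  0, src->height, dst->data, dst->linesize);

        sws_freeContext(sws);
        return dst;
    }
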

Crigges
  • I've heard that the Jetson Nano team from NVIDIA is going to support hardware decoding with ffmpeg. Since that is an ARM board with a very powerful GPU (which can decode 8 simultaneous 1080p streams) but a very limited CPU, I don't even wanna try CPU color conversion; I'm sure I'm going to need GPU color conversion and possibly no memory copying. I'm writing a Network Video Recorder, so I need maximum performance because there'll be lots of simultaneous decoding of streams. I'm gonna look into GPU color conversion, thank you so much! – PPP Jul 26 '19 at 14:05