What's FFmpeg doing with avcodec_send_packet()?

Question

I'm trying to optimise a piece of software for playing video, which internally uses the FFmpeg libraries for decoding. We've found that on some large (4K, 60fps) video, it sometimes takes longer to decode a frame than that frame should be displayed for; sadly, because of the problem domain, simply buffering/skipping frames is not an option.

However, it appears that the FFmpeg executable is able to decode the video in question fine, at about 2x speed, so I've been trying to work out what we're doing wrong.

I've written a very stripped-back decoder program for testing; the source is here (it's about 200 lines). From profiling it, it appears that the one major bottleneck during decoding is the avcodec_send_packet() function, which can take up to 50ms per call. However, measuring the same call in FFmpeg shows strange behaviour:

Yes, you can embed images

(these are the times taken for each call to avcodec_send_packet() in milliseconds, when decoding a 4K 25fps VP9-encoded video.)

Basically, it seems that when FFmpeg uses this function, it only really takes any amount of time to complete every N calls, where N is the number of threads being used for decoding. However, both my test decoder and the actual product use 4 threads for decoding, and this doesn't happen; when using frame-based threading, the test decoder behaves like FFmpeg using only 1 thread. This would seem to indicate that we're not using multithreading at all, but we've still seen performance improvements by using more threads.

FFmpeg's results average out to being about twice as fast overall as our decoders, so clearly we're doing something wrong. I've been reading through FFmpeg's source to try to find any clues, but it's so far eluded me.

My question is: what's FFmpeg doing here that we're not? Alternatively, how can we increase the performance of our decoder?

Any help is greatly appreciated.

I've scavenged the source code from the archive and uploaded it to Github, hopefully it should survive now! — Jim, Sep 03 '21 at 09:13

score 4 · Answer 1 · answered Sep 02 '21 at 07:33

I was facing the same problem. It took me quite a while to figure out a solution which I want to share here for future references:

Enable multithreading for the decoder. Per default the decoder only uses one thread, depending on the decoder, multithreading can speed up decoding drastically.

Assuming you have AVFormatContext *format_ctx, a matching codec AVCodec* codec and AVCodecContext* codec_ctx (allocated using avcodec_alloc_context3).

Before opening the codec context (using avcodec_open2) you can configure multithreading. Check the capabilites of the codec in order to decide which kind of multithreading you can use:

// set codec to automatically determine how many threads suits best for the decoding job
codec_ctx->thread_count = 0;

if (codec->capabilities | AV_CODEC_CAP_FRAME_THREADS)
   codec_ctx->thread_type = FF_THREAD_FRAME;
else if (codec->capabilities | AV_CODEC_CAP_SLICE_THREADS)
   codec_ctx->thread_type = FF_THREAD_SLICE;
else
   codec_ctx->thread_count = 1; //don't use multithreading

Another speed-up I found out is the following: keep sending packets to the decoder (thats what avcodec_send_packet() is doing) until you get AVERROR(EAGAIN) as return value. This means the internal decoder buffers are full and you first need to collect the decoded frames (but remember to send this last packet again after the decoder is empty again). Now you can collect the decoded frames using avcodec_receive_frame until you get AVERROR(EAGAIN) again. Some decoders work way faster when they have mutiple frames queued for decoding (thats what the decoder does when codec_ctx->thread_type = FF_THREAD_FRAME is set).

We were already trying to use multithreading, but I think your comment about sending multiple times before receiving is insightful; we were previously only sending one packet per receive. FFmpeg wanting to collect enough packets for each of it's threads to work on would possibly explain the behaviour we saw. — Jim, Sep 03 '21 at 09:27

score 0 · Answer 2 · answered Mar 18 '19 at 10:28

avcodec_send_packet() and avcodec_receive_frame() are wrapper functions most important thing those do is calling selected codec's decode function and returns decoded frame (or error).

Try tweaking the codec options, for example, low latency may not give you what you want. And sometimes old api (I believe it still around) avcodec_decode_video2() outperforms newer one, you may try that too.

What's FFmpeg doing with avcodec_send_packet()?

2 Answers2