
I want to decode H.264 video from a collection of MPEG-2 Transport Stream packets, but I am not clear on what to pass to avcodec_decode_video2.

The documentation says to pass "the input AVPacket containing the input buffer."

But what should be in the input buffer?

A PES packet will be spread across the payload portions of several TS packets, with the NALU(s) inside the PES. So should I pass a TS payload fragment? The entire PES packet? The PES payload only?
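For reference, this is the layering in question. Below is a minimal sketch of pulling the payload out of a single 188-byte TS packet (field offsets per the MPEG-TS spec; the helper name is illustrative, not from any library):

// Hypothetical helper: locate the payload within one 188-byte TS packet.
// Returns payload length (0 if none), or -1 on error.
static int tsPayload(const uint8_t *pkt, const uint8_t **payload, bool *pesStart)
{
    if (pkt[0] != 0x47)                  // sync byte
        return -1;
    *pesStart = (pkt[1] & 0x40) != 0;    // payload_unit_start_indicator:
                                         // a new PES packet begins in this payload
    int afc = (pkt[3] >> 4) & 0x3;       // adaptation_field_control
    if (!(afc & 0x1))                    // payload bit not set
        return 0;
    int offset = 4;                      // fixed TS header
    if (afc & 0x2)                       // adaptation field present
        offset += 1 + pkt[4];            // length byte + adaptation field body
    if (offset >= 188)
        return -1;
    *payload = pkt + offset;
    return 188 - offset;
}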

This Sample Code mentions:

BUT some other codecs (msmpeg4, mpeg4) are inherently frame based, so you must call them with all the data for one frame exactly. You must also initialize 'width' and 'height' before initializing them.

But I can find no info on what "all the data" means...
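From what I can gather, "all the data" means one complete access unit (all the NALUs for one frame). libavcodec has a parser API that can reassemble frame-sized chunks from an arbitrary ES byte stream; below is a minimal, untested sketch of how that would look, reusing mpDecoderContext and mpFrameDec from the snippet that follows (error handling omitted):

AVCodecParserContext *parser = av_parser_init(AV_CODEC_ID_H264);

// Feed arbitrary chunks of elementary stream; the parser emits a
// complete frame (outSize > 0) once it has gathered one.
while (inSize > 0)
{
    uint8_t *outData = NULL;
    int      outSize = 0;
    int used = av_parser_parse2(parser, mpDecoderContext,
                                &outData, &outSize,
                                inData, inSize,
                                AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
    inData += used;
    inSize -= used;

    if (outSize > 0)                     // one frame's worth of data
    {
        AVPacket pkt;
        av_init_packet(&pkt);
        pkt.data = outData;
        pkt.size = outSize;
        avcodec_decode_video2(mpDecoderContext, mpFrameDec, &got_picture, &pkt);
    }
}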

Passing a fragment of a TS packet payload is not working:

char errMsg[128];   // for av_make_error_string() below

AVPacket avDecPkt;
av_init_packet(&avDecPkt);
avDecPkt.data = inbuf_ptr;  // fragment of a TS packet payload
avDecPkt.size = esBufSize;

len = avcodec_decode_video2(mpDecoderContext, mpFrameDec, &got_picture, &avDecPkt);
if (len < 0)
{
    printf("  TS PKT #%.0f. Error decoding frame #%04d [rc=%d '%s']\n",
        tsPacket.pktNum, mDecodedFrameNum, len, av_make_error_string(errMsg, 128, len));
    return;
}

Output:

[h264 @ 0x81cd2a0] no frame!
TS PKT #2973. Error decoding frame #0001 [rc=-1094995529 'Invalid data found when processing input']

EDIT

Using the excellent hints from WLGfx, I made this simple program to try decoding TS packets. As input, I prepared a file containing only TS packets from the video PID.

It feels close but I don't know how to set up the FormatContext. The code below segfaults at av_read_frame() (and internally at ret = s->iformat->read_packet(s, pkt)). s->iformat is zero.

Suggestions?

EDIT II - Sorry, forgot to post the source code.

EDIT III - Sample code updated to simulate reading the TS packet queue.

/*
 * Test program for video decoder
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern "C" {

#ifdef __cplusplus
    #define __STDC_CONSTANT_MACROS
    #ifdef _STDINT_H
        #undef _STDINT_H
    #endif
    #include <stdint.h>
#endif
}

extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libswscale/swscale.h"
#include "libavutil/imgutils.h"
#include "libavutil/opt.h"
}


class VideoDecoder
{
public:
    VideoDecoder();
    bool rcvTsPacket(AVPacket &inTsPacket);

private:
    AVCodec         *mpDecoder;
    AVCodecContext  *mpDecoderContext;
    AVFrame         *mpDecodedFrame;
    AVFormatContext *mpFmtContext;

};

VideoDecoder::VideoDecoder()
{
    av_register_all();

    // FORMAT CONTEXT SETUP
    mpFmtContext = avformat_alloc_context();
    mpFmtContext->flags = AVFMT_NOFILE;
    // ????? WHAT ELSE ???? //

    // DECODER SETUP
    mpDecoder = avcodec_find_decoder(AV_CODEC_ID_H264);
    if (!mpDecoder)
    {
        printf("Could not load decoder\n");
        exit(11);
    }

    mpDecoderContext = avcodec_alloc_context3(NULL);
    if (avcodec_open2(mpDecoderContext, mpDecoder, NULL) < 0)
    {
        printf("Cannot open decoder context\n");
        exit(1);
    }

    mpDecodedFrame = av_frame_alloc();
}

bool
VideoDecoder::rcvTsPacket(AVPacket &inTsPkt)
{
    bool ret = true;

    if ((av_read_frame(mpFmtContext, &inTsPkt)) < 0)
    {
        printf("Error in av_read_frame()\n");
        ret = false;
    }
    else
    {
        // success.  Decode the TS packet
        int got;
        int len = avcodec_decode_video2(mpDecoderContext, mpDecodedFrame, &got, &inTsPkt);
        if (len < 0)
            ret = false;

        if (got)
            printf("GOT A DECODED FRAME\n");
    }

    return ret;
}

int
main(int argc, char **argv)
{
    if (argc != 2)
    {
        printf("Usage: %s tsInFile\n", argv[0]);
        exit(1);
    }

    FILE *tsInFile = fopen(argv[1], "r");
    if (!tsInFile)
    {
        perror("Could not open TS input file");
        exit(2);
    }

    unsigned int tsPktNum = 0;
    uint8_t      tsBuffer[256];
    AVPacket     tsPkt;
    av_init_packet(&tsPkt);

    VideoDecoder vDecoder;

    while (!feof(tsInFile))
    {
        tsPktNum++;

        tsPkt.size = 188;
        tsPkt.data = tsBuffer;
        if (fread(tsPkt.data, 188, 1, tsInFile) != 1)
            break;  // EOF or truncated packet

        vDecoder.rcvTsPacket(tsPkt);
    }
}
Danny
  • Incoming packets will have a stream ID for audio, video, subtitles and data. Once you've determined and created a codec context for a video stream, all you need to do is pass the packets to your own decode function. The best source of information is the source code to ffplay... – WLGfx Nov 24 '16 at 16:54
  • Thanks. The TS packets are already constrained to a single PID, containing only the video. It is H.264 so I used AV_CODEC_ID_H264 as the decoder. When you say "pass the packets", which ones? complete TS packets or re-assembled PES packets? I'll check out ffplay. – Danny Nov 25 '16 at 00:41

2 Answers


I've got some code snippets that might help you out as I've been working with MPEG-TS also.

Starting with my packet thread, which checks each packet against the stream IDs I've already found and created codec contexts for:

void *FFMPEG::thread_packet_function(void *arg) {
    FFMPEG *ffmpeg = (FFMPEG*)arg;
    for (int c = 0; c < MAX_PACKETS; c++)
        ffmpeg->free_packets[c] = &ffmpeg->packet_list[c];
    ffmpeg->packet_pos = MAX_PACKETS;

    Audio.start_decoding();
    Video.start_decoding();
    Subtitle.start_decoding();

    while (!ffmpeg->thread_quit) {
        if (ffmpeg->packet_pos != 0 &&
                Audio.okay_add_packet() &&
                Video.okay_add_packet() &&
                Subtitle.okay_add_packet()) {

            pthread_mutex_lock(&ffmpeg->packet_mutex); // get free packet
            AVPacket *pkt = ffmpeg->free_packets[--ffmpeg->packet_pos]; // pre decrement
            pthread_mutex_unlock(&ffmpeg->packet_mutex);

            if ((av_read_frame(ffmpeg->fContext, pkt)) >= 0) { // success
                int id = pkt->stream_index;
                if (id == ffmpeg->aud_stream.stream_id) Audio.add_packet(pkt);
                else if (id == ffmpeg->vid_stream.stream_id) Video.add_packet(pkt);
                else if (id == ffmpeg->sub_stream.stream_id) Subtitle.add_packet(pkt);
                else { // unknown packet
                    av_packet_unref(pkt);

                    pthread_mutex_lock(&ffmpeg->packet_mutex); // put packet back
                    ffmpeg->free_packets[ffmpeg->packet_pos++] = pkt;
                    pthread_mutex_unlock(&ffmpeg->packet_mutex);

                    //LOGI("Dumping unknown packet, id %d", id);
                }
            } else {
                av_packet_unref(pkt);

                pthread_mutex_lock(&ffmpeg->packet_mutex); // put packet back
                ffmpeg->free_packets[ffmpeg->packet_pos++] = pkt;
                pthread_mutex_unlock(&ffmpeg->packet_mutex);

                //LOGI("No packet read");
            }
        } else { // buffers full so yield
            //LOGI("Packet reader on hold: Audio-%d, Video-%d, Subtitle-%d",
            //  Audio.packet_pos, Video.packet_pos, Subtitle.packet_pos);
            usleep(1000);
            //sched_yield();
        }
    }
    return 0;
}

Each decoder for audio, video and subtitles has its own thread, which receives packets from the above thread via ring buffers. I've had to separate the decoders into their own threads because CPU usage was increasing when I started using the deinterlace filter.

My video decoder reads packets from the buffers, and when it has finished with a packet it sends the packet back to be unref'd so it can be used again. Balancing the packet buffers doesn't take much time once everything is running.
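The add_packet/okay_add_packet helpers aren't shown in this answer; here is a minimal sketch of what they might look like, assuming a fixed-size pointer array guarded by the same mutex the decoder uses (only the names come from the snippets, the bodies are a guess):

// Hypothetical bodies for the helpers called above.
bool VideoManager::okay_add_packet() {
    return packet_pos < MAX_PACKETS;     // room left in the ring?
}

void VideoManager::add_packet(AVPacket *pkt) {
    pthread_mutex_lock(&packet_mutex);
    packets[packet_pos++] = pkt;         // append; the decoder shifts from index 0
    pthread_mutex_unlock(&packet_mutex);
}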

Here's the snippet from my video decoder:

void *VideoManager::decoder(void *arg) {
    LOGI("Video decoder started");
    VideoManager *mgr = (VideoManager *)arg;
    while (!ffmpeg.thread_quit) {
        pthread_mutex_lock(&mgr->packet_mutex);
        if (mgr->packet_pos != 0) {
            // fetch first packet to decode
            AVPacket *pkt = mgr->packets[0];

            // shift list down one
            for (int c = 1; c < mgr->packet_pos; c++) {
                mgr->packets[c-1] = mgr->packets[c];
            }
            mgr->packet_pos--;
            pthread_mutex_unlock(&mgr->packet_mutex); // finished with packets array

            int got;
            AVFrame *frame = ffmpeg.vid_stream.frame;
            avcodec_decode_video2(ffmpeg.vid_stream.context, frame, &got, pkt);
            ffmpeg.finished_with_packet(pkt);
            if (got) {
#ifdef INTERLACE_ALL
                if (!frame->interlaced_frame) mgr->add_av_frame(frame, 0);
                else {
                    if (!mgr->filter_initialised) mgr->init_filter_graph(frame);
                    av_buffersrc_add_frame_flags(mgr->filter_src_ctx, frame, AV_BUFFERSRC_FLAG_KEEP_REF);
                    int c = 0;
                    while (true) {
                        AVFrame *filter_frame = ffmpeg.vid_stream.filter_frame;
                        int result = av_buffersink_get_frame(mgr->filter_sink_ctx, filter_frame);
                        if (result == AVERROR(EAGAIN) ||
                                result == AVERROR_EOF ||
                                result < 0)
                            break;
                        mgr->add_av_frame(filter_frame, c++);
                        av_frame_unref(filter_frame);
                    }
                    //LOGI("Interlaced %d frames, decode %d, playback %d", c, mgr->decode_pos, mgr->playback_pos);
                }
#elif defined(INTERLACE_HALF)
                if (!frame->interlaced_frame) mgr->add_av_frame(frame, 0);
                else {
                    if (!mgr->filter_initialised) mgr->init_filter_graph(frame);
                    av_buffersrc_add_frame_flags(mgr->filter_src_ctx, frame, AV_BUFFERSRC_FLAG_KEEP_REF);
                    int c = 0;
                    while (true) {
                        AVFrame *filter_frame = ffmpeg.vid_stream.filter_frame;
                        int result = av_buffersink_get_frame(mgr->filter_sink_ctx, filter_frame);
                        if (result == AVERROR(EAGAIN) ||
                                result == AVERROR_EOF ||
                                result < 0)
                            break;
                        mgr->add_av_frame(filter_frame, c++);
                        av_frame_unref(filter_frame);
                    }
                    //LOGI("Interlaced %d frames, decode %d, playback %d", c, mgr->decode_pos, mgr->playback_pos);
                }
#else
                mgr->add_av_frame(frame, 0);
#endif
            }
            //LOGI("decoded video packet");
        } else {
            pthread_mutex_unlock(&mgr->packet_mutex);
        }
    }
    LOGI("Video decoder ended");
}

As you can see, I'm using a mutex when passing packets back and forth.

Once a frame has been decoded, I just copy the YUV buffers from the frame into another buffer list for later use. I don't convert the YUV on the CPU; I use a shader which converts the YUV to RGB on the GPU.
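The shader itself isn't shown here; for reference, a CPU sketch of the standard BT.601 video-range YUV-to-RGB conversion such a shader would implement (the coefficients are the usual ones, not taken from this project):

// Reference BT.601 (video range) YUV -> RGB for one pixel.
// A fragment shader performs the same arithmetic per pixel on the GPU.
static void yuvToRgb(uint8_t y, uint8_t u, uint8_t v,
                     uint8_t &r, uint8_t &g, uint8_t &b)
{
    float yf = 1.164f * (y - 16);
    float uf = u - 128.0f;
    float vf = v - 128.0f;

    auto clamp8 = [](float x) {
        return (uint8_t)(x < 0.0f ? 0.0f : (x > 255.0f ? 255.0f : x));
    };

    r = clamp8(yf + 1.596f * vf);
    g = clamp8(yf - 0.813f * vf - 0.391f * uf);
    b = clamp8(yf + 2.018f * uf);
}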

The next snippet adds my decoded frame to my buffer list. This may help understand how to deal with the data.

void VideoManager::add_av_frame(AVFrame *frame, int field_num) {
    int y_linesize = frame->linesize[0];
    int u_linesize = frame->linesize[1];

    int hgt = frame->height;

    int y_buffsize = y_linesize * hgt;
    int u_buffsize = u_linesize * hgt / 2;

    int buffsize = y_buffsize + u_buffsize + u_buffsize;

    VideoBuffer *buffer = &buffers[decode_pos];

    if (ffmpeg.is_network && playback_pos == decode_pos) { // patched 25/10/16 wlgfx
        buffer->used = false;
        if (!buffer->data) buffer->data = (char*)mem.alloc(buffsize);
        if (!buffer->data) {
            LOGI("Dropped frame, allocation error");
            return;
        }
    } else if (playback_pos == decode_pos) {
        LOGI("Dropped frame, ran out of decoder frame buffers");
        return;
    } else if (!buffer->data) {
        buffer->data = (char*)mem.alloc(buffsize);
        if (!buffer->data) {
            LOGI("Dropped frame, allocation error.");
            return;
        }
    }

    buffer->y_frame = buffer->data;
    buffer->u_frame = buffer->y_frame + y_buffsize;
    buffer->v_frame = buffer->y_frame + y_buffsize + u_buffsize;

    buffer->wid = frame->width;
    buffer->hgt = hgt;

    buffer->y_linesize = y_linesize;
    buffer->u_linesize = u_linesize;

    int64_t pts = av_frame_get_best_effort_timestamp(frame);
    buffer->pts = pts;
    buffer->buffer_size = buffsize;

    double field_add = av_q2d(ffmpeg.vid_stream.context->time_base) * field_num;
    buffer->frame_time = av_q2d(ts_stream) * pts + field_add;

    memcpy(buffer->y_frame, frame->data[0], (size_t) (buffer->y_linesize * buffer->hgt));
    memcpy(buffer->u_frame, frame->data[1], (size_t) (buffer->u_linesize * buffer->hgt / 2));
    memcpy(buffer->v_frame, frame->data[2], (size_t) (buffer->u_linesize * buffer->hgt / 2));

    buffer->used = true;
    decode_pos = (decode_pos + 1) % MAX_VID_BUFFERS; // '++' plus assignment on the same variable is undefined behaviour

    //if (field_num == 0) LOGI("Video %.2f, %d - %d",
    //        buffer->frame_time - Audio.pts_start_time, decode_pos, playback_pos);
}

If there's anything else that I may be able to help with just give me a shout. :-)

EDIT:

Here's the snippet showing how I open my video stream context; it automatically determines the codec, whether it is H.264, MPEG-2, or another:

void FFMPEG::open_video_stream() {
    vid_stream.stream_id = av_find_best_stream(fContext, AVMEDIA_TYPE_VIDEO,
                                                -1, -1, &vid_stream.codec, 0);
    if (vid_stream.stream_id == -1) return;

    vid_stream.context = fContext->streams[vid_stream.stream_id]->codec;

    if (!vid_stream.codec || avcodec_open2(vid_stream.context,
            vid_stream.codec, NULL) < 0) {
        vid_stream.stream_id = -1;
        return;
    }

    vid_stream.frame = av_frame_alloc();
    vid_stream.filter_frame = av_frame_alloc();
}

EDIT2:

This is how I've opened the input stream, whether it be file or URL. The AVFormatContext is the main context for the stream.

bool FFMPEG::start_stream(char *url_, float xtrim, float ytrim, int gain) {
    aud_stream.stream_id = -1;
    vid_stream.stream_id = -1;
    sub_stream.stream_id = -1;

    this->url = url_;
    this->xtrim = xtrim;
    this->ytrim = ytrim;
    Audio.volume = gain;

    Audio.init();
    Video.init();

    fContext = avformat_alloc_context();

    if ((avformat_open_input(&fContext, url_, NULL, NULL)) != 0) {
        stop_stream();
        return false;
    }

    if ((avformat_find_stream_info(fContext, NULL)) < 0) {
        stop_stream();
        return false;
    }

    // network stream will overwrite packets if buffer is full

    is_network =  url.substr(0, 4) == "udp:" ||
                  url.substr(0, 4) == "rtp:" ||
                  url.substr(0, 5) == "rtsp:" ||
                  url.substr(0, 5) == "http:";  // added for wifi broadcasting ability

    // determine if stream is audio only

    is_mp3 = url.substr(url.size() - 4) == ".mp3";

    LOGI("Stream: %s", url_);

    if (!open_audio_stream()) {
        stop_stream();
        return false;
    }

    if (is_mp3) {
        vid_stream.stream_id = -1;
        sub_stream.stream_id = -1;
    } else {
        open_video_stream();
        open_subtitle_stream();

        if (vid_stream.stream_id == -1) { // switch to audio only
            close_subtitle_stream();
            is_mp3 = true;
        }
    }

    LOGI("Audio: %d, Video: %d, Subtitle: %d",
            aud_stream.stream_id,
            vid_stream.stream_id,
            sub_stream.stream_id);

    if (aud_stream.stream_id != -1) {
        LOGD("Audio stream time_base {%d, %d}",
            aud_stream.context->time_base.num,
            aud_stream.context->time_base.den);
    }

    if (vid_stream.stream_id != -1) {
        LOGD("Video stream time_base {%d, %d}",
            vid_stream.context->time_base.num,
            vid_stream.context->time_base.den);
    }

    LOGI("Starting packet and decode threads");

    thread_quit = false;

    pthread_create(&thread_packet, NULL, &FFMPEG::thread_packet_function, this);

    Display.set_overlay_timout(3.0);

    return true;
}

EDIT: (constructing an AVPacket)

Construct an AVPacket to send to the decoder...

AVPacket packet;
av_init_packet(&packet);
packet.data = myTSpacketdata; // pointer to the TS packet
packet.size = 188;

You should be able to reuse the packet, and it may need unref'ing between uses.
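For example, a minimal reuse loop (a sketch; getNextTsPacket() is a hypothetical stand-in for however you pop your queue):

AVPacket packet;
av_init_packet(&packet);

while (running) {
    packet.data = getNextTsPacket();   // hypothetical queue pop, returns 188 bytes
    packet.size = 188;

    int got = 0;
    avcodec_decode_video2(decoderContext, frame, &got, &packet);

    av_packet_unref(&packet);          // drop any refs the decoder may have added
    av_init_packet(&packet);           // reset fields before reuse
}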

WLGfx
  • Wow, that looks great. Thanks. Just to confirm, your "packet_list" and "packets" are MPEG-TS packets? With 0x47 sync byte, PID, TS header, etc? – Danny Nov 25 '16 at 12:47
  • Yes they are the packets received from av_read_frame(). There's no need to check sync byte, only the stream ID as in the packet thread. – WLGfx Nov 25 '16 at 12:52
  • NB: I noticed a difference between ffmpeg 3.1.4 and 3.2. The earlier version gave 1-to-1 packets to the decoder; 3.2 gives smaller packets, as ffmpeg has now decoupled avcodec_decode_video2() into 2 separate calls and deprecated it. Either way, this makes no difference because you just check the &got variable. – WLGfx Nov 25 '16 at 12:56
  • And another note as you mentioned about the raw TS packets. Yes you can feed the decoder with those packets too just so long as you identify the correct stream ID. – WLGfx Nov 25 '16 at 13:00
  • I've used your code to make a simple test program. To simulate the in-memory transfer, I read from a file containing **only** TS packets of the Video PID. I think I'm close but don't know how to set up the FormatContext. Sample code posted above... any ideas? – Danny Nov 25 '16 at 16:29
  • I've added another edit as to how I've opened up the stream. This will work with raw TS files. – WLGfx Nov 25 '16 at 16:37
  • Thanks for that. Sorry, I didn't post the sample code, was late. Check it out in the edit. Problem is my actual program doesn't read from a file. I have a queue of 188B TS packets which I need to decode. So program loop pulls a TS pkt from the queue, decodes, repeats. The queue of pkts is created elsewhere and I have no control over that. So how to feed avcodec_decode_video with those TS packets? – Danny Nov 26 '16 at 02:07
  • How are you getting the TS packets? If you can connect ffmpeg direct to that stream, it will make life so much easier. My project connects to a UDP IPTV broadcast, ie UDP://224.224.1.1:5000 – WLGfx Nov 26 '16 at 17:50
  • I've just looked at your sample code. Yes, you can feed the packets, but, also you should be able to connect ffmpeg to the raw file stream too. If the latter, you can define the codec before connecting to the stream. Use your way first though as you're almost there. – WLGfx Nov 26 '16 at 17:53
  • Thanks for the response. Actually, I don't have a file stream; my program reads an in-memory queue coming from another process/thread. The sample code is just simulating reading the TS pkt queue by reading a file; I've updated the sample code to be more clear. Feeding the packets as above doesn't work. Tons of errors: "no frame!" and "illegal data". So I'm not doing something right. – Danny Nov 29 '16 at 01:17
  • Ah yes, you have to manually construct an AVPacket to send to ffmpeg. I've updated my answer. – WLGfx Nov 29 '16 at 11:15
  • Thanks again. That's what I did but I'm missing how to set the AVFormatContext. Your example shows setting it up from a URL. How would you do it if reading from a TS packet queue as in my example code, i.e. VideoDecoder::rcvTsPacket(AVPacket &inTsPkt)? I find almost no documentation about AVFormatContext :-( – Danny Dec 02 '16 at 02:28
  • That's something I've not yet done. However, I did find something on tinterweb for creating an avio_open stream for raw data. Something I'll need soon too. These two pages on SO, http://stackoverflow.com/questions/5964142/raw-h264-frames-in-mpegts-container-using-libavcodec and http://stackoverflow.com/questions/23354915/how-to-write-h264-raw-stream-into-mp4-using-ffmpeg-directly – WLGfx Dec 02 '16 at 09:03
  • Yes, avio is the way to go... Here's a couple more examples. https://www.codeproject.com/tips/489450/creating-custom-ffmpeg-io-context and http://www.ffmpeg.org/doxygen/trunk/doc_2examples_2avio_reading_8c-example.html Let me know how you get on please... – WLGfx Dec 02 '16 at 09:15
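Following up on the avio links in those last comments: a minimal sketch of wiring an in-memory TS packet queue into libavformat with a custom AVIOContext, modelled on FFmpeg's avio_reading example (tsQueuePop() and queueHandle are hypothetical stand-ins for the real queue interface):

// Read callback: libavformat pulls raw TS bytes through here.
static int readTsQueue(void *opaque, uint8_t *buf, int buf_size)
{
    int copied = tsQueuePop(opaque, buf, buf_size);   // hypothetical: blocks for data
    return copied > 0 ? copied : AVERROR_EOF;
}

bool openTsQueue(void *queueHandle, AVFormatContext **outFmtCtx)
{
    const int bufSize = 188 * 64;                     // multiple of the TS packet size
    uint8_t *avioBuf = (uint8_t *)av_malloc(bufSize); // must be av_malloc'd

    AVIOContext *avioCtx = avio_alloc_context(avioBuf, bufSize,
                                              0 /*read-only*/, queueHandle,
                                              readTsQueue, NULL, NULL);

    AVFormatContext *fmtCtx = avformat_alloc_context();
    fmtCtx->pb = avioCtx;                             // use our I/O, not a file/URL

    if (avformat_open_input(&fmtCtx, NULL, NULL, NULL) < 0)
        return false;                                 // probes the queue, fills in s->iformat
    if (avformat_find_stream_info(fmtCtx, NULL) < 0)
        return false;

    *outFmtCtx = fmtCtx;                              // av_read_frame() now works on it
    return true;
}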

You must first use the libavformat library to demux the compressed frames out of the container. Then you can decode them using avcodec_decode_video2. Look at this tutorial: http://dranger.com/ffmpeg/
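In outline, that flow looks like the sketch below (per the dranger tutorial, using the API of that era; for the asker's in-memory queue, a custom AVIOContext is needed instead of a filename, as discussed above):

// Sketch: classic demux-then-decode loop.
AVFormatContext *fmt = NULL;
if (avformat_open_input(&fmt, "input.ts", NULL, NULL) < 0) return;
if (avformat_find_stream_info(fmt, NULL) < 0) return;

int vid = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);

AVCodecContext *ctx = fmt->streams[vid]->codec;           // pre-3.x API
AVCodec *dec = avcodec_find_decoder(ctx->codec_id);
avcodec_open2(ctx, dec, NULL);

AVFrame *frame = av_frame_alloc();
AVPacket pkt;
while (av_read_frame(fmt, &pkt) >= 0) {                   // compressed frames out
    if (pkt.stream_index == vid) {
        int got = 0;
        avcodec_decode_video2(ctx, frame, &got, &pkt);    // decode
        if (got) { /* use frame */ }
    }
    av_packet_unref(&pkt);
}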

szatmary
  • Thanks but I don't have a file. I have a collection of MPEG-2 Transport Stream packets... – Danny Nov 25 '16 at 00:42
  • No problem. Just use a different io context to read from memory instead of a file. – szatmary Nov 25 '16 at 01:06
  • The TS packets arrive one by one over time. So do you mean I should extract the PES packet from the collection of TS packets and call avcodec_decode_video2 once I have the reassembled PES packet? Or do I strip the PES header and pass the ES starting at the start code? Thanks! – Danny Nov 25 '16 at 01:21