
I have an animation/sprite developed in C++ with the SDL2 libraries (based on this answer). The bitmaps are saved to a certain path. They are 640x480, and their format is given by the SDL constant SDL_PIXELFORMAT_ARGB8888.
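
For context, the frames are captured roughly as in the sketch below (simplified, not the actual saving code; error handling is omitted, and SDL_SaveBMP stands in for whatever does the actual write, but the SDL_CreateRGBSurface/SDL_RenderReadPixels calls are the ones the program uses):

#include <SDL2/SDL.h>

/* Sketch: grab the renderer's current contents into an ARGB8888 surface
 * and write it out as a BMP file. */
void saveFrame(SDL_Renderer *gRenderer, const char *path)
{
    SDL_Surface *sshot = SDL_CreateRGBSurface(0, 640, 480, 32,
                                              0x00ff0000,  /* Rmask */
                                              0x0000ff00,  /* Gmask */
                                              0x000000ff,  /* Bmask */
                                              0xff000000); /* Amask */
    SDL_RenderReadPixels(gRenderer, NULL, SDL_PIXELFORMAT_ARGB8888,
                         sshot->pixels, sshot->pitch);
    SDL_SaveBMP(sshot, path);   /* e.g. "/tmp/alok1/ss099.bmp" */
    SDL_FreeSurface(sshot);
}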

I have a second program, written in C on top of the FFmpeg libraries, which reads one image from the above path (just one for the time being; it will read the whole series once it works for one). In gist, it does the following (validation and comments skipped for conciseness):

AVCodec *codec;
AVCodecContext *c = NULL;
int i, ret, x, y, got_output;
FILE *f;
AVFrame *frame;
AVPacket pkt;
uint8_t endcode[] = { 0, 0, 1, 0xb7 }; /* MPEG end-of-sequence code */

codec = avcodec_find_encoder(codec_id); /* codec_id: e.g. AV_CODEC_ID_MPEG1VIDEO */
c = avcodec_alloc_context3(codec);
c->bit_rate = 400000;
/* resolution must be a multiple of two */
c->width = 640;
c->height = 480;
c->time_base = (AVRational ) { 1, 25 };
c->gop_size = 5;
c->max_b_frames = 1;
c->pix_fmt = AV_PIX_FMT_YUV420P;

av_opt_set(c->priv_data, "preset", "slow", 0);
avcodec_open2(c, codec, NULL);

f = fopen(filename, "wb");
frame = av_frame_alloc();
av_image_alloc(frame->data, frame->linesize, c->width, c->height, c->pix_fmt, 32);

for (i = 0; i < 25; ++i) { /* 25 frames at 25 fps = 1 second of video */
    readSingleFile("/tmp/alok1/ss099.bmp", &frame->data);//Read the saved BMP into frame->data
    frame->pts = i;
    frame->width = 640;
    frame->height = 480;
    frame->format = c->pix_fmt;

    av_init_packet(&pkt);
    pkt.data = NULL; // packet data will be allocated by the encoder
    pkt.size = 0;
    ret = avcodec_encode_video2(c, &pkt, frame, &got_output);

    if (got_output) {
        printf("Write frame %3d (size=%5d)\n", i, pkt.size);
        fwrite(pkt.data, 1, pkt.size, f);
    }
    av_packet_unref(&pkt);
}
/* flush delayed frames buffered inside the encoder */
for (got_output = 1; got_output; i++) {
    fflush(stdout);

    ret = avcodec_encode_video2(c, &pkt, NULL, &got_output);
    if (ret < 0) {
        fprintf(stderr, "Error encoding frame\n");
        exit(1);
    }

    if (got_output) {
        printf("[DELAYED]Write frame %3d (size=%5d)\n", i, pkt.size);
        fwrite(pkt.data, 1, pkt.size, f);
        av_packet_unref(&pkt);
    }
}

fwrite(endcode, 1, sizeof(endcode), f);
//cleanup

As a result of the above code (which compiles without trouble), I get a video that plays for 1 second; that part works as expected. The problem is that the video shows nothing but a solid green screen: [screenshot of the green video output]

The image being read by the readSingleFile(...) function renders fine in Linux image viewers (Gwenview and Okular): [the original bitmap image]

Any pointers as to what could be going wrong?

  • Does `readSingleFile` do its job? In particular, does it convert your ARGB (packed 32-bit format) into YUV420P (planar 12-bit format)? A green field suggests that it doesn't, because the chroma components are all 0. – Andrey Turkin Oct 27 '16 at 09:43
  • `readSingleFile()` just reads the file into `frame->data`. I apologise if this is a very stupid counter-question, but is FFmpeg not capable of making an implicit conversion when I've set the `frame->format` ? Because I've tried setting it to `AV_PIX_FMT_RGB32` to no avail. @AndreyTurkin – alok Oct 27 '16 at 09:48
  • No, it cannot do that. In fact, now that I'm thinking about it, those black dots in the video are probably the BMP header. FFmpeg will not parse containers by itself either. Basically, `frame->data` is expected to hold raw data in the format specified by `frame->format`, and it is your job to get it there. You can use demuxer/decoder facilities to get raw pictures from your images (they are treated as single-frame videos), and you can use swscale facilities to convert between ARGB and YUV420P. But it is your job to spell out all these details to ffmpeg. – Andrey Turkin Oct 27 '16 at 09:55
  • Ah! That makes a lot of sense. Those black dots were nagging at my mind too. Although, if raw data is set into `frame->data` with its format given in `frame->format`, it should do the conversion from ARGB to YUV420, right? As I see it, the programming logic would be iterating over the data twice. Is this two-pass logic really necessary? I'm only theorizing; I will change the code and report back with an answer in a couple of hours. – alok Oct 27 '16 at 10:28
  • OK, so I added the code from [link](http://stackoverflow.com/questions/16667687/how-to-convert-rgb-from-yuv420p-for-ffmpeg-encoder) and now I can see the image. The problem is that it is inverted in both senses: in colour, and actually upside-down. I figure the colour inversion is owing to a wrong input format, but what's the flipped image about? – alok Oct 27 '16 at 10:44
  • Regarding format - there is no implicit conversion in encoder, and there is no two-pass logic here (whatever that might be). You specify picture format/dimensions beforehand, and then you must pass frames in that same format and with same dimensions. – Andrey Turkin Oct 27 '16 at 11:51
  • Color inversion is probably because you specified something like RGB instead of BGR (24-bit RGB in Windows lingo is actually BGR). Image is flipped because of yet another BMP (or Windows) quirk where it uses bottom-up images instead of top-bottom images used by everyone else including ffmpeg. Maybe you can use some hackery with pointing data to the end of an image and using negative linesizes to solve this without extra "flip" conversion (not sure if that would work). – Andrey Turkin Oct 27 '16 at 11:54
  • I think color inversion is a lesser issue, I'll just have to play around with it until I get it right, that's probably all there is to it. The BMP is generated on linux, by a C program using SDL libraries using the API `SDL_CreateRGBSurface` in combination with `SDL_RenderReadPixels(gRenderer, NULL, SDL_PIXELFORMAT_ARGB8888, sshot->pixels, sshot->pitch)`. I've tried around 20 formats from FFmpeg's enums, no luck there so far. Any guesses on this? – alok Oct 27 '16 at 12:26
  • Also, by the two-pass thing, I meant that we're allocating the frame, doing the RGB->YUV conversion first, then the YUV encoding for adding to the video container (if I'm not wrong). But come to think of it, having it handled inside the FFmpeg library or having the developer write those two lines of code isn't going to change the performance in any way, is it? – alok Oct 27 '16 at 12:28
  • @AndreyTurkin Please post comment #3(upvoted) as an answer, I'll accept it right away. – alok Oct 27 '16 at 12:28

1 Answer


To sum up the comments:

  • The encoder expects raw image data in the format specified when it was opened; it will not try to convert anything.
  • Colorspace/format conversion has to be done manually; use swscale (see the sketch after this list).
  • If you are using the Windows API to load the image: it gives you BGR, not RGB.
  • BMP files usually, but not always, store the image bottom-up, as opposed to the top-down layout FFmpeg uses; if it is bottom-up, it has to be flipped (there might be a way to do that with little or no performance hit if combined with the colorspace conversion).
  • Also keep an eye on linesizes. Each line in an image can occupy more bytes than its width; this applies both to images allocated by FFmpeg and to images loaded from BMP files. One has to be careful to always provide valid linesizes to each API call.

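Putting that together, here is a minimal sketch of the conversion step. Names like `bmp_pixels` and `bmp_stride` are placeholders for whatever your BMP loader produces; on a little-endian machine, SDL's ARGB8888 should correspond to FFmpeg's AV_PIX_FMT_BGRA, and the flip uses the negative-linesize trick from the comments, which I believe swscale accepts for packed input:

#include <libavutil/frame.h>
#include <libswscale/swscale.h>

/* Convert one bottom-up BGRA image into a pre-allocated YUV420P AVFrame,
 * flipping it vertically in the same pass. Returns 0 on success. */
static int bgra_to_yuv420p(const uint8_t *bmp_pixels, int bmp_stride,
                           AVFrame *frame)
{
    struct SwsContext *sws = sws_getContext(
        frame->width, frame->height, AV_PIX_FMT_BGRA,    /* source */
        frame->width, frame->height, AV_PIX_FMT_YUV420P, /* destination */
        SWS_BILINEAR, NULL, NULL, NULL);
    if (!sws)
        return -1;

    /* Point at the last row and use a negative stride so swscale reads
     * the bottom-up BMP rows in top-down order. */
    const uint8_t *src_data[4] =
        { bmp_pixels + (frame->height - 1) * bmp_stride, NULL, NULL, NULL };
    const int src_linesize[4] = { -bmp_stride, 0, 0, 0 };

    sws_scale(sws, src_data, src_linesize, 0, frame->height,
              frame->data, frame->linesize);
    sws_freeContext(sws);
    return 0;
}

This would replace the plain file read into `frame->data` in the question's loop: load the BMP's pixel data first, then convert into the frame that gets passed to avcodec_encode_video2.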

In addition to the above, below are "must-reads" for the final solution:
1. Taking a screenshot with SDL
2. RGB to YUV conversion
3. FFmpeg-related source code was written with this as base

– Andrey Turkin