
I am using OpenH264 as the encoder and I want to mux its output into a playable MP4 using libmp4v2.

The resulting .mp4 only works partially. It is playable in VLC and MPC-HC, but not in Windows Media Player or the Windows 10 "Movies & TV" app.
My goal is to make the file work in all of these players.

Both Windows players tell me they don't know the codec, so they can't play the file back.
This is not really true, since I can play a manually muxed file using the same H.264 bitstream, created with FFmpeg from the CLI:

ffmpeg -i "testenc.h264" -c:v copy -f mp4 "output.mp4"

Based on this, I think my encoding process works fine and the problem lies in the muxing procedure.

Edit: Thanks to Rudolfs Bundulis' answer, which pointed out that the SPS/PPS data is missing, I was able to restructure my code. It now tries to include the missing data by analysing the encoder's bitstream and calling MP4AddH264SequenceParameterSet or MP4AddH264PictureParameterSet when necessary, but still without success.

My full code:

#include "stdafx.h"
#include <iostream>
#include <stdio.h>
#include <chrono>
#include "mp4v2/mp4v2.h"
#include "codec_api.h"

#define WIDTH 1280
#define HEIGHT 960
#define DURATION MP4_INVALID_DURATION
#define NAL_SPS 1
#define NAL_PPS 2
#define NAL_I 3
#define NAL_P 4

using namespace std;
using namespace chrono;

/* Just some dummy data to see artifacts etc. */
void prepareFrame(int i, SSourcePicture* pic) {
    for (int y = 0; y<HEIGHT; y++) {
        for (int x = 0; x<WIDTH; x++) {
            pic->pData[0][y * WIDTH + x] = x + y + i * 3;
        }
    }

    for (int y = 0; y<HEIGHT / 2; y++) {
        for (int x = 0; x<WIDTH / 2; x++) {
            pic->pData[1][y * (WIDTH / 2) + x] = 128 + y + i * 2;
            pic->pData[2][y * (WIDTH / 2) + x] = 64 + x + i * 5;
        }
    }
    pic->uiTimeStamp = (i + 1) * 1000 / 75;
}

void printHex(const unsigned char* arr, int len) {
    for (int i = 0; i < len; i++) {
        if (arr[i] < 16) {
            cout << "0";
        }
        cout << hex << (int)arr[i] << " ";
    }
    cout << endl;
}

void mp4Encode(MP4FileHandle mp4Handle, MP4TrackId track, uint8_t * bitstream, int length) {
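    // Note: the comparisons below match the complete NAL header byte (forbidden_zero_bit,
    // nal_ref_idc and nal_unit_type together): 0x67 = SPS, 0x68 = PPS, 0x65 = IDR slice,
    // 0x61 = non-IDR slice with nal_ref_idc = 3.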

    int index = -1;

    if (bitstream[0] == 0 && bitstream[1] == 0 && bitstream[2] == 0 && bitstream[3] == 1 && bitstream[4] == 0x67) {
        index = NAL_SPS;
    }
    if (bitstream[0] == 0 && bitstream[1] == 0 && bitstream[2] == 0 && bitstream[3] == 1 && bitstream[4] == 0x68) {
        index = NAL_PPS;
    }
    if (bitstream[0] == 0 && bitstream[1] == 0 && bitstream[2] == 0 && bitstream[3] == 1 && bitstream[4] == 0x65) {
        index = NAL_I;
    }
    if (bitstream[0] == 0 && bitstream[1] == 0 && bitstream[2] == 0 && bitstream[3] == 1 && bitstream[4] == 0x61) {
        index = NAL_P;
    }

    switch (index) {
    case NAL_SPS:
        cout << "Detected SPS" << endl;
        MP4AddH264SequenceParameterSet(mp4Handle, track, bitstream + 4, length - 4);
        break;
    case NAL_PPS:
        cout << "Detected PPS" << endl;
        MP4AddH264PictureParameterSet(mp4Handle, track, bitstream + 4, length - 4);
        break;
    case NAL_I:
    {
        cout << "Detected I" << endl;
        uint8_t * IFrameData = (uint8_t *) malloc(length + 1);
        IFrameData[0] = (length - 3) >> 24;
        IFrameData[1] = (length - 3) >> 16;
        IFrameData[2] = (length - 3) >> 8;
        IFrameData[3] = (length - 3) & 0xff;

        memcpy(IFrameData + 4, bitstream + 3, length - 3);

        if (!MP4WriteSample(mp4Handle, track, IFrameData, length + 1, DURATION, 0, 1)) {
            cout << "Error when writing sample" << endl;
            system("pause");
            exit(1);
        }
        free(IFrameData);

        break;
    }
    case NAL_P:
    {
        cout << "Detected P" << endl;
        bitstream[0] = (length - 4) >> 24;
        bitstream[1] = (length - 4) >> 16;
        bitstream[2] = (length - 4) >> 8;
        bitstream[3] = (length - 4) & 0xff;

        if (!MP4WriteSample(mp4Handle, track, bitstream, length, DURATION, 0, 1)) {
            cout << "Error when writing sample" << endl;
            system("pause");
            exit(1);
        }
        break;
    }
    }
    if (index == -1) {
        cout << "Could not detect nal type" << endl;
        system("pause");
        exit(1);
    }
}

int main()
{
    //just to measure performance
    high_resolution_clock::time_point time = high_resolution_clock::now(); 

    //Create MP4
    MP4FileHandle mp4Handle = MP4Create("test.mp4", 0);
    MP4SetTimeScale(mp4Handle, 90000);

    //Create filestream for binary h264 output for testing 
    FILE* targetFile;
    targetFile = fopen("testenc.h264", "wb");

    if (!targetFile) {
        cout << "failed to create file" << endl;
        system("pause");
        return 1;
    }

    ISVCEncoder *encoder;
    int rv = WelsCreateSVCEncoder(&encoder);

    //Encoder params
    SEncParamExt param;
    encoder->GetDefaultParams(&param);
    param.iUsageType = CAMERA_VIDEO_REAL_TIME;
    param.fMaxFrameRate = 75.f;
    param.iLtrMarkPeriod = 75;
    param.iPicWidth = WIDTH;
    param.iPicHeight = HEIGHT;
    param.iTargetBitrate = 40000000;
    param.bEnableDenoise = false;
    param.iSpatialLayerNum = 1;
    param.bUseLoadBalancing = false;
    param.bEnableSceneChangeDetect = false;
    param.bEnableBackgroundDetection = false;
    param.bEnableAdaptiveQuant = false;
    param.bEnableFrameSkip = false; 
    param.iMultipleThreadIdc = 16;
    //param.uiIntraPeriod = 10;

    for (int i = 0; i < param.iSpatialLayerNum; i++) {
        param.sSpatialLayers[i].iVideoWidth = WIDTH >> (param.iSpatialLayerNum - 1 - i);
        param.sSpatialLayers[i].iVideoHeight = HEIGHT >> (param.iSpatialLayerNum - 1 - i);
        param.sSpatialLayers[i].fFrameRate = 75.f;
        param.sSpatialLayers[i].iSpatialBitrate = param.iTargetBitrate;
        param.sSpatialLayers[i].uiProfileIdc = PRO_BASELINE;
        param.sSpatialLayers[i].uiLevelIdc = LEVEL_4_2;
        param.sSpatialLayers[i].iDLayerQp = 42;

        SSliceArgument sliceArg;
        sliceArg.uiSliceMode = SM_FIXEDSLCNUM_SLICE;
        sliceArg.uiSliceNum = 16;

        param.sSpatialLayers[i].sSliceArgument = sliceArg;      
    }

    param.uiMaxNalSize = 1500;
    param.iTargetBitrate *= param.iSpatialLayerNum;
    encoder->InitializeExt(&param);
    int videoFormat = videoFormatI420;
    encoder->SetOption(ENCODER_OPTION_DATAFORMAT, &videoFormat);

    // MP4AddH264VideoTrack arguments (per the libmp4v2 header): timeScale, sampleDuration,
    // width, height, AVCProfileIndication (66 = Baseline), profile_compat,
    // AVCLevelIndication (42 = Level 4.2), sampleLenFieldSizeMinusOne (3 = 4-byte NAL lengths).
    MP4TrackId track = MP4AddH264VideoTrack(mp4Handle, 90000, 90000/25, WIDTH, HEIGHT, 66, 192, 42, 3);
    MP4SetVideoProfileLevel(mp4Handle, 0x7f);

    SFrameBSInfo info;
    memset(&info, 0, sizeof(SFrameBSInfo));
    SSourcePicture pic;
    memset(&pic, 0, sizeof(SSourcePicture));
    pic.iPicWidth = WIDTH;
    pic.iPicHeight = HEIGHT;
    pic.iColorFormat = videoFormatI420;
    pic.iStride[0] = pic.iPicWidth;
    pic.iStride[1] = pic.iStride[2] = pic.iPicWidth >> 1;
    int frameSize = WIDTH * HEIGHT * 3 / 2;
    pic.pData[0] = new unsigned char[frameSize];
    pic.pData[1] = pic.pData[0] + WIDTH * HEIGHT;
    pic.pData[2] = pic.pData[1] + (WIDTH * HEIGHT >> 2);
    for (int num = 0; num<75; num++) {
        cout << "-------FRAME " << dec << num << "-------" << endl;
        prepareFrame(num, &pic);
        rv = encoder->EncodeFrame(&pic, &info);
        if (rv != cmResultSuccess) {
            cout << "encode failed" << endl;
            continue;
        }
        if (info.eFrameType != videoFrameTypeSkip) {

            for (int i = 0; i < info.iLayerNum; ++i) {
                int len = 0;
                const SLayerBSInfo& layerInfo = info.sLayerInfo[i];
                for (int j = 0; j < layerInfo.iNalCount; ++j) {
                    cout << "Layer: " << dec << i << "| Nal: " << j << endl << "Hex: ";
                    printHex(info.sLayerInfo[i].pBsBuf + len, 20);
                    mp4Encode(mp4Handle, track, info.sLayerInfo[i].pBsBuf + len, layerInfo.pNalLengthInByte[j]);
                    len += layerInfo.pNalLengthInByte[j];
                }
                //mp4Encode(mp4Handle, track, info.sLayerInfo[i].pBsBuf, len);
            }
            //fwrite(info.sLayerInfo[0].pBsBuf, 1, len, targetFile);

        }
    }
    int res = 0;
    encoder->GetOption(ENCODER_OPTION_PROFILE, &res);
    cout << res << endl;
    fflush(targetFile);
    fclose(targetFile);

    encoder->Uninitialize();
    WelsDestroySVCEncoder(encoder);
    //Close MP4
    MP4Close(mp4Handle);

    cout << "done in: ";
    cout << duration_cast<milliseconds>(high_resolution_clock::now() - time).count() << endl;
    system("pause");
    return 0;
}
Crigges
  • Zippyshare is a great place to get redirected to a malware site. – Retired Ninja Mar 21 '18 at 04:12
  • However I just changed it to Google Drive, till the next one complains. – Crigges Mar 21 '18 at 04:31
  • Please check out https://meta.stackoverflow.com/questions/275358/what-is-the-policy-for-linking-to-zip-files-on-file-sharing-websites - I think linking these files here is legit. – Crigges Mar 21 '18 at 04:46
  • 1
    You might try downloading something like MediaInfo and comparing the values in your broken video to your good video. It might give you an idea what to change, and it'll let you see what the changes actually do to the end result. – Retired Ninja Mar 21 '18 at 13:58
  • Just tried that, but without success. I can't find a proper way to change some of the flags/information, and when using a hex editor I brick the whole file by breaking some unknown offsets. However, thanks for your help. – Crigges Mar 21 '18 at 20:19

1 Answer


You can use MP4Box from GPAC to analyze the MP4 box layout of both files. Comparing the dumps shows that the bad file is missing the SPS/PPS data in the avcC box. The same parameter-set NAL units are most likely present in the sample data as well, but the specification requires them to also be in the avcC box (some players handle SPS/PPS inlined in the stream, but that is bad practice since it breaks seeking and the like, because you don't know upfront which sample groups reference which parameter sets).
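
For reference, the box tree can also be dumped from the command line (the exact flags may differ slightly between GPAC versions, and the file names here are just placeholders):

MP4Box -diso good.mp4
MP4Box -diso broken.mp4

-diso writes the full box layout as XML next to the input file, and -info prints a shorter summary; comparing the avcC entries of the two dumps makes the missing SPS/PPS easy to spot.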

A quick Google search for libmp4v2 gave me this example, which shows how to actually call MP4AddH264SequenceParameterSet/MP4AddH264PictureParameterSet to provide the SPS/PPS, while you only call MP4WriteSample, which could be the issue.

My subjective opinion: I have never used libmp4v2, but if you don't know how to use it either, just use FFmpeg instead - there are more examples and the community is bigger. Muxing H.264 into MP4 with it is quite simple; again, there are lots of examples on the internet.
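
For completeness, a stream-copy remux with the FFmpeg libraries (the programmatic counterpart of the ffmpeg -c:v copy command in the question) is only a handful of calls. This is a rough, untested sketch based on the standard libavformat remuxing example; the function name remuxH264ToMp4 is mine, and a raw .h264 elementary stream carries no timestamps, so real code would also have to assign pts/dts (e.g. from a fixed frame rate):

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}

int remuxH264ToMp4(const char* inPath, const char* outPath) {
    AVFormatContext* in = nullptr;
    AVFormatContext* out = nullptr;
    if (avformat_open_input(&in, inPath, nullptr, nullptr) < 0) return -1;
    if (avformat_find_stream_info(in, nullptr) < 0) return -1;

    avformat_alloc_output_context2(&out, nullptr, "mp4", outPath);
    AVStream* ist = in->streams[0];
    AVStream* ost = avformat_new_stream(out, nullptr);
    avcodec_parameters_copy(ost->codecpar, ist->codecpar); // copy the probed stream parameters
    ost->codecpar->codec_tag = 0;

    avio_open(&out->pb, outPath, AVIO_FLAG_WRITE);
    avformat_write_header(out, nullptr);                    // the mp4 muxer writes the avcC box here

    AVPacket* pkt = av_packet_alloc();
    while (av_read_frame(in, pkt) >= 0) {
        pkt->stream_index = 0;
        av_packet_rescale_ts(pkt, ist->time_base, ost->time_base);
        av_interleaved_write_frame(out, pkt);
        av_packet_unref(pkt);
    }
    av_write_trailer(out);

    av_packet_free(&pkt);
    avio_closep(&out->pb);
    avformat_close_input(&in);
    avformat_free_context(out);
    return 0;
}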

SUMMARY

  1. MP4 requires the SPS/PPS information to be in the avcC box - some players may be able to decode the stream if these units are put inline with the samples, but to conform to the specification one should always have the avcC box present; otherwise a player is free to refuse to play the stream.
  2. Depending on the library used, there may be different techniques for signaling the SPS/PPS to the muxer, but as seen here with libmp4v2, one must use MP4AddH264SequenceParameterSet/MP4AddH264PictureParameterSet. To obtain the SPS/PPS data one should parse the bitstream; this varies depending on the bitstream format (whether Annex B format with start codes or AVCC format with interleaved lengths is used - see this for more info). Once the SPS/PPS info has been extracted, it should be passed to the muxing library (see the sketch after this list).
  3. Handle SPS/PPS changes with care. The specification actually allows multiple stsd stream description boxes that can then be referenced, but as far as I recall, Windows Media Player handles this poorly, so if possible stick to a single SPS/PPS set. One should be able to configure the encoder not to emit duplicate SPS/PPS entries on each keyframe.
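
To make point 2 concrete, here is a rough, untested sketch of how a single Annex B NAL unit coming out of OpenH264 (length taken from pNalLengthInByte, as in the code in the question) could be routed to libmp4v2: SPS/PPS go into the avcC box, everything else becomes a length-prefixed sample. The helper name writeNal and the omitted error handling are my own additions, not part of either library:

#include <cstdint>
#include <cstring>
#include <vector>
#include "mp4v2/mp4v2.h"

void writeNal(MP4FileHandle file, MP4TrackId track, const uint8_t* nal, int length) {
    // Skip the Annex B start code (00 00 01 or 00 00 00 01).
    int offset = (nal[2] == 1) ? 3 : 4;
    const uint8_t* payload = nal + offset;
    uint32_t payloadSize = (uint32_t)(length - offset);
    int nalType = payload[0] & 0x1F;        // nal_unit_type = lower 5 bits of the header byte

    if (nalType == 7) {                     // SPS -> stored in the avcC box
        MP4AddH264SequenceParameterSet(file, track, payload, payloadSize);
    } else if (nalType == 8) {              // PPS -> stored in the avcC box
        MP4AddH264PictureParameterSet(file, track, payload, payloadSize);
    } else {                                // slice data -> 4-byte big-endian length + payload
        std::vector<uint8_t> sample(payloadSize + 4);
        sample[0] = (payloadSize >> 24) & 0xff;
        sample[1] = (payloadSize >> 16) & 0xff;
        sample[2] = (payloadSize >> 8) & 0xff;
        sample[3] = payloadSize & 0xff;
        memcpy(sample.data() + 4, payload, payloadSize);
        bool isSyncSample = (nalType == 5); // IDR slice
        MP4WriteSample(file, track, sample.data(), (uint32_t)sample.size(),
                       MP4_INVALID_DURATION, 0, isSyncSample);
    }
}

Strictly speaking, all slice NALs that belong to one frame should be concatenated (each with its own length prefix) into a single MP4WriteSample call; the sketch writes one NAL per sample only to mirror the structure of the code in the question, and it relies on the default sample duration set when the track was created.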
Rudolfs Bundulis
  • Thank you for your answer. I found out that calling MP4AddH264SequenceParameterSet with some dummy data removed the error in the default players, but since the data is wrong there is no picture. The example is great and I had already come across it, but the main issue is that it uses x264 as the encoder and therefore I am missing the parameters it used. Do you have any idea where OpenH264 stores them? Your suggestion of using FFmpeg is right and I would prefer to use it, but my goal is to create an H.264/MP4 encoder under a BSD license. – Crigges Mar 21 '18 at 21:25
  • 1
    @Crigges - yeah, you should look at the encoded NAL units and distinguish by type what to pass (the example actually does that by setting the `index` variable). I mean if we start discussing the NAL unit syntax, types and etc. its gonna take a while. Are you familiar with H.264? – Rudolfs Bundulis Mar 21 '18 at 21:29
  • Not at all. All I know about H.264 is that it is basically a collection of multiple compression algorithms and is pretty much the standard for everything right now. Additionally, I know that the compression is based on previous frames, which is why you can't cut H.264 without re-encoding, and that there are options to control the video size like bitrate/quality control. – Crigges Mar 21 '18 at 21:33
  • 1
    @Crigges well, then I'd say either adapt the example from the link with your encoding pipeline or try ffmpeg. Ffmpeg abstracts some stuff - you would be able to pass in raw frames to the H.264 encoder and then write them into the container without worrying about SPS/PPS and stuff. – Rudolfs Bundulis Mar 21 '18 at 21:48
  • Thank you for your help, I will try to dig into NAL/SPS/PPS and see if I can get it working. FFmpeg is not a real option, otherwise I would have used it in the first place. – Crigges Mar 21 '18 at 21:59
  • @Crigges why is FFmpeg not an option? – Rudolfs Bundulis Mar 21 '18 at 22:35
  • As I said, I want to create an open-source H.264/MP4 encoder under a BSD license. – Crigges Mar 21 '18 at 23:09
  • You are mixing things up. FFmpeg is not a codec, but a framework that wraps many different codec implementations under a nice API. See https://stackoverflow.com/questions/42220081/how-to-choose-between-openh264-and-x264-decoder for an example of using FFmpeg with OpenH264. I'm advising FFmpeg only because of its ability to switch codecs without messing with new encoder APIs. You can still use OpenH264 under the hood. – Rudolfs Bundulis Mar 21 '18 at 23:17
  • But not under a BSD license. I would need to recompile FFmpeg without "--enable-gpl" and without "--enable-nonfree", and put a lot of effort into not violating the LGPL (see https://www.ffmpeg.org/legal.html). I want to open source a simple encoder with included muxing for MP4 files under a BSD license. – Crigges Mar 21 '18 at 23:21
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/167302/discussion-between-crigges-and-rudolfs-bundulis). – Crigges Mar 21 '18 at 23:28
  • Ok, if you want BSD then I understand the motivation, but then again - and sorry for my arrogance - at a point where H.265/VP9/AV1 are becoming the dominant codecs this makes little sense to me. – Rudolfs Bundulis Mar 21 '18 at 23:29