
I'm attempting to stream an H.264 video feed to a web browser. Media Foundation is used to encode a fragmented MPEG4 stream (MFCreateFMPEG4MediaSink with MFTranscodeContainerType_FMPEG4, plus MF_LOW_LATENCY and MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS enabled). The stream is then connected to a web server through IMFByteStream.

Streaming of the H.264 video works fine when it's consumed by a <video src=".."/> tag. However, the resulting latency is ~2 sec, which is too much for the application in question. My suspicion is that client-side buffering causes most of the latency. I'm therefore experimenting with Media Source Extensions (MSE) for programmatic control over the in-browser streaming. Chrome does, however, fail with the following error when consuming the same MPEG4 stream through MSE:

Failure parsing MP4: TFHD base-data-offset not allowed by MSE. See https://www.w3.org/TR/mse-byte-stream-format-isobmff/#movie-fragment-relative-addressing

Here is an mp4dump of a moof/mdat fragment in the MPEG4 stream. It clearly shows that the TFHD contains an "illegal" base data offset parameter:

[moof] size=8+200
  [mfhd] size=12+4
    sequence number = 3
  [traf] size=8+176
    [tfhd] size=12+16, flags=1
      track ID = 1
      base data offset = 36690
    [trun] size=12+136, version=1, flags=f01
      sample count = 8
      data offset = 0
[mdat] size=8+1624

I'm using Chrome 65.0.3325.181 (Official Build) (32-bit), running on Win10 version 1709 (16299.309).

Is there any way of generating an MSE-compatible H.264/MPEG4 video stream using Media Foundation?

Status Update:

Based on roman-r's advice, I managed to fix the problem myself by intercepting the generated MPEG4 stream and performing the following modifications:

  • Modify Track Fragment Header Box (tfhd):
    • remove base_data_offset parameter (reduces stream size by 8 bytes)
    • set default-base-is-moof flag
  • Add missing Track Fragment Decode Time (tfdt) (increases stream size by 20 bytes)
    • set baseMediaDecodeTime parameter
  • Modify Track fragment Run box (trun):
    • adjust data_offset parameter

The field descriptions are documented in https://www.iso.org/standard/68960.html (free download).
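For illustration, the tfhd/trun part of the interception above can be sketched as a stand-alone patch function. This is a simplified sketch, not the actual AppWebStream code: PatchMoof is a hypothetical helper, a single traf/tfhd/trun per moof is assumed (as produced by MFCreateFMPEG4MediaSink), and the tfdt insertion is left out for brevity since it follows the same pattern.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

static uint32_t ReadU32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) | (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}
static void WriteU32(uint8_t* p, uint32_t v) {
    p[0] = uint8_t(v >> 24); p[1] = uint8_t(v >> 16); p[2] = uint8_t(v >> 8); p[3] = uint8_t(v);
}

// Patch a single [moof] buffer:
//  * tfhd: drop the 8-byte base_data_offset, set default-base-is-moof (0x020000)
//  * trun: re-point data_offset at the first media byte (new moof size + 8-byte mdat header)
std::vector<uint8_t> PatchMoof(std::vector<uint8_t> buf) {
    if (buf.size() < 8 || std::memcmp(&buf[4], "moof", 4) != 0)
        throw std::runtime_error("not a moof box");

    // locate [traf] among the children of [moof]
    size_t traf = 0, tfhd = 0, trun = 0;
    for (size_t off = 8; off + 8 <= buf.size();) {
        uint32_t sz = ReadU32(&buf[off]);
        if (sz < 8) break;
        if (std::memcmp(&buf[off + 4], "traf", 4) == 0) { traf = off; break; }
        off += sz;
    }
    if (!traf) throw std::runtime_error("no traf");

    // locate [tfhd] and [trun] inside [traf]
    const size_t trafEnd = traf + ReadU32(&buf[traf]);
    for (size_t off = traf + 8; off + 8 <= trafEnd;) {
        uint32_t sz = ReadU32(&buf[off]);
        if (sz < 8) break;
        if (std::memcmp(&buf[off + 4], "tfhd", 4) == 0) tfhd = off;
        if (std::memcmp(&buf[off + 4], "trun", 4) == 0) trun = off;
        off += sz;
    }
    if (!tfhd || !trun) throw std::runtime_error("no tfhd/trun");

    uint32_t tfhdFlags = ReadU32(&buf[tfhd + 8]) & 0x00FFFFFF;
    if (tfhdFlags & 0x000001) { // base-data-offset-present
        // base_data_offset follows the 4-byte track_ID; drop its 8 bytes
        buf.erase(buf.begin() + tfhd + 16, buf.begin() + tfhd + 24);
        if (trun > tfhd) trun -= 8;
        tfhdFlags = (tfhdFlags & ~0x000001u) | 0x020000; // default-base-is-moof
        WriteU32(&buf[tfhd + 8], tfhdFlags);             // tfhd version is 0
        WriteU32(&buf[tfhd], ReadU32(&buf[tfhd]) - 8);   // shrink tfhd,
        WriteU32(&buf[traf], ReadU32(&buf[traf]) - 8);   // traf and
        WriteU32(&buf[0],    ReadU32(&buf[0])    - 8);   // moof sizes
    }
    uint32_t trunFlags = ReadU32(&buf[trun + 8]) & 0x00FFFFFF;
    if (trunFlags & 0x000001) // data-offset-present
        WriteU32(&buf[trun + 16], ReadU32(&buf[0]) + 8); // moof-relative offset
    return buf;
}
```

Each moof/mdat pair read from the IMFByteStream would be run through such a patcher before being forwarded to the browser.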

Switching to MSE-based video streaming reduced the latency from ~2.0 sec to 0.7 sec. The latency was further reduced to 0-1 frames by calling IMFSinkWriter::NotifyEndOfSegment after each IMFSinkWriter::WriteSample call.

There's a sample implementation available at https://github.com/forderud/AppWebStream

  • I am not aware of a method to alter the behavior; however, I used to work around this problem in the past both by post-processing the produced byte stream and by creating an alternate media sink. Even though it is not exactly what you are asking about, both paths (either of the two) lead to MSE-friendly output from the Media Foundation pipeline. – Roman R. Mar 22 '18 at 14:55
  • Post-processing is unfortunately not an option, since what is streamed is a "live" video feed. However, an alternative media sink could be worth investigating. Can you please provide more details and/or point me in the direction of some sample code? – Fredrik Orderud Mar 22 '18 at 15:16
  • I don't have any code I can share, sorry for this. Post-processing (yes, I did it for a live low-latency feed) is intercepting the byte stream, parsing it into atoms and re-composing the stuff back. In general, it's doable. A custom media sink is straightforward development which replaces the stock sink. MF is notorious for not having many samples, but maybe you will be able to find some media sink sample to start from. The WavSink sample from Win SDK 7.x might be a good example. – Roman R. Mar 22 '18 at 15:36

4 Answers


I was getting the same error (Failure parsing MP4: TFHD base-data-offset not allowed by MSE) when trying to play an fMP4 via MSE. The fMP4 had been created from an MP4 using the following ffmpeg command:

ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov myfmp4video.mp4

Based on this question I was able to find out that, to have the fMP4 working in Chrome, I had to add the "default_base_moof" flag. So, after creating the fMP4 with the following command:

ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof myfmp4video.mp4

I was able to successfully play the video using Media Source Extensions.

This Mozilla article helped me find the missing flag: https://developer.mozilla.org/en-US/docs/Web/API/Media_Source_Extensions_API/Transcoding_assets_for_MSE

rsc

The ~0.7 sec latency mentioned in your Status Update is caused by Media Foundation's MFTranscodeContainerType_FMPEG4 containerizer, which (for unknown reasons) gathers roughly 1/3 second of frames and outputs them as one MP4 moof/mdat box pair. This means that at 60 FPS you need to wait 19 frames before getting any output from MFTranscodeContainerType_FMPEG4.

To output a single MP4 moof/mdat pair per frame, simply lie that MF_MT_FRAME_RATE is 1 FPS (or any rate whose frame duration exceeds 1/3 sec). To play the video at the correct speed, either use Media Source Extensions' <video>.playbackRate or, better, update the timescale (i.e. multiply it by the real FPS) of the mvhd and mdhd boxes in your MP4 stream interceptor to get a correctly timed MP4 stream.
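The mvhd/mdhd timescale rewrite can be sketched roughly as follows. This is a hedged sketch: FindBox and ScaleTimescale are illustrative helper names of mine, with the full-box field offsets taken from ISO/IEC 14496-12 (version 0 vs. version 1 layouts).

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

static uint32_t ReadU32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) | (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}
static void WriteU32(uint8_t* p, uint32_t v) {
    p[0] = uint8_t(v >> 24); p[1] = uint8_t(v >> 16); p[2] = uint8_t(v >> 8); p[3] = uint8_t(v);
}

// Recursively locate the first box of the given type; returns its offset or SIZE_MAX.
size_t FindBox(const std::vector<uint8_t>& buf, size_t begin, size_t end, const char* type) {
    for (size_t off = begin; off + 8 <= end;) {
        uint32_t sz = ReadU32(&buf[off]);
        if (sz < 8 || off + sz > end) break;
        if (std::memcmp(&buf[off + 4], type, 4) == 0) return off;
        // descend into the container boxes on the path to mvhd/mdhd
        static const char* containers[] = {"moov", "trak", "mdia"};
        for (const char* c : containers)
            if (std::memcmp(&buf[off + 4], c, 4) == 0) {
                size_t hit = FindBox(buf, off + 8, off + sz, type);
                if (hit != SIZE_MAX) return hit;
            }
        off += sz;
    }
    return SIZE_MAX;
}

// Multiply the timescale of a full box (mvhd or mdhd) at offset "box" by "factor".
void ScaleTimescale(std::vector<uint8_t>& buf, size_t box, uint32_t factor) {
    uint8_t version = buf[box + 8];
    size_t ts = box + (version == 1 ? 28 : 20); // skips creation/modification times
    WriteU32(&buf[ts], ReadU32(&buf[ts]) * factor);
}
```

With the encoder told 1 FPS, the interceptor would scale both timescales by the real FPS (e.g. 60) once, when the init segment (ftyp+moov) passes through.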

Doing that, the latency can be squeezed to under 20 ms. It is barely perceptible when you view the output side by side on localhost in chains such as Unity (research) -> NvEnc -> MFTranscodeContainerType_FMPEG4 -> WebSocket -> Chrome Media Source Extensions display.

Note that MFTranscodeContainerType_FMPEG4 still introduces a 1-frame delay (1st frame in, no output; 2nd frame in, 1st frame out; ...), hence the 20 ms latency at 60 FPS. The only solution to that seems to be writing your own FMPEG4 containerizer, but that is an order of magnitude more complex than intercepting Media Foundation's MP4 streams.

svobodb
  • Thanks a lot for a very helpful suggestion @svobodb! I've now updated my AppWebStream sample project to pretend 1 FPS to reduce the latency to just a 1 frame delay. – Fredrik Orderud Nov 27 '20 at 11:22
  • I was recently tipped off about a simpler way of reducing the latency than fiddling with the time-stamps. Calling IMFSinkWriter::NotifyEndOfSegment after each WriteSample seems to do the trick of reducing the encoding latency to just 0-1 frames. – Fredrik Orderud Aug 29 '21 at 19:54

The problem was solved by following roman-r's advice and modifying the generated MPEG4 stream. See answer above.


Another way to do this is, again, using the same code @Fredrik mentioned, but I write my own IMFByteStream and check the chunks written to it. FFmpeg writes the atoms almost one at a time, so you can check the atom name and apply the modifications there. It is the same thing. I wish there was an MSE-compliant Windows media sink.
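The chunk inspection described here might look roughly like this inside the custom byte stream's Write() method (ListAtoms is an illustrative helper of mine, not a Media Foundation API):

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

static uint32_t ReadU32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) | (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Returns the types of the complete top-level boxes ("atoms") in one chunk,
// so the caller can decide which ones need patching before forwarding.
std::vector<std::string> ListAtoms(const uint8_t* data, size_t len) {
    std::vector<std::string> types;
    for (size_t off = 0; off + 8 <= len;) {
        uint32_t sz = ReadU32(data + off);
        if (sz < 8 || off + sz > len) break; // truncated box: buffer until the next Write()
        types.emplace_back(reinterpret_cast<const char*>(data + off + 4), 4);
        off += sz;
    }
    return types;
}
```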

Is there one that can generate .ts files for HLS?

Evren Bingøl