Media Source Extensions (MSE) need fragmented MP4 (fMP4) for MP4 playback in the browser.
2 Answers
A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
Boxes aka Atoms
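All MP4 files use an object-oriented format made up of boxes (also known as atoms). Each box starts with a small header that gives its size and a four-character type such as ftyp, moov or mdat.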
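To make the box layout concrete, here is a minimal Python sketch that walks the top-level boxes of a byte string. It assumes the usual header layout (a 32-bit big-endian size followed by a four-character type, with size 1 meaning a 64-bit size follows and size 0 meaning "runs to the end"). The names iter_boxes, make_box and sample are made up for this illustration, and the sample bytes are synthetic, not a playable file.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, offset: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, start_offset, size) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = end - offset
        if size < header:
            break  # malformed box, stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a synthetic box for the demo (hypothetical helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic stand-in for a non-fragmented file: ftyp, moov, one mdat.
sample = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
          + make_box(b"moov", b"")
          + make_box(b"mdat", b"\x00" * 10))

for box_type, start, size in iter_boxes(sample):
    print(box_type, start, size)
```

On a real file you would pass the file's bytes instead of sample; container boxes such as moov can be walked by calling iter_boxes again on their payload range.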
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
Non-Fragmented MP4
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Media Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10 sec mark in a 0.5 Mbps and a 1 Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
Fragmented MP4
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx and a series of moof+mdat boxes. The sidx is the Segment Index and stores metadata giving the precise byte-range locations of the moof+mdat segments.
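As a rough illustration of how the sidx maps to byte ranges, the sketch below parses a sidx box and returns one (first_byte, last_byte, duration) tuple per reference. It follows the field layout of the Segment Index box as I understand it from ISO/IEC 14496-12, and it assumes a 32-bit box size and that every reference points at media rather than at a nested sidx. The function name parse_sidx and the synthetic demo_sidx bytes are invented for this example.

```python
import struct
from typing import List, Tuple


def parse_sidx(data: bytes, sidx_start: int = 0) -> List[Tuple[int, int, float]]:
    """Return (first_byte, last_byte, duration_seconds) for each sidx reference."""
    size, box_type = struct.unpack_from(">I4s", data, sidx_start)
    if box_type != b"sidx":
        raise ValueError("not a sidx box")
    pos = sidx_start + 8
    version = data[pos]
    pos += 4  # 1 byte version + 3 bytes flags
    _reference_id, timescale = struct.unpack_from(">II", data, pos)
    pos += 8
    if version == 0:
        _earliest_time, first_offset = struct.unpack_from(">II", data, pos)
        pos += 8
    else:
        _earliest_time, first_offset = struct.unpack_from(">QQ", data, pos)
        pos += 16
    _reserved, reference_count = struct.unpack_from(">HH", data, pos)
    pos += 4

    # Offsets count from the first byte after the sidx box itself.
    segment_start = sidx_start + size + first_offset
    ranges = []
    for _ in range(reference_count):
        type_and_size, duration, _sap = struct.unpack_from(">III", data, pos)
        pos += 12
        referenced_size = type_and_size & 0x7FFFFFFF  # top bit is reference_type
        ranges.append((segment_start,
                       segment_start + referenced_size - 1,
                       duration / timescale))
        segment_start += referenced_size
    return ranges


# Synthetic version-0 sidx: timescale 1000, two 5-second references
# of 500 and 700 bytes (made-up sizes).
payload = (struct.pack(">B3xII", 0, 1, 1000)
           + struct.pack(">II", 0, 0)
           + struct.pack(">HH", 0, 2)
           + struct.pack(">III", 500, 5000, 0x90000000)
           + struct.pack(">III", 700, 5000, 0x90000000))
demo_sidx = struct.pack(">I4s", 8 + len(payload), b"sidx") + payload

print(parse_sidx(demo_sidx))
```

Each returned range covers one moof+mdat pair, so it can be used directly as the bounds of a byte-range request.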
Essentially, you can independently load the sidx (its byte range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments are temporally aligned across files of different bitrates, making it easy to adapt the bitrate during playback.
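The benefit of alignment can be sketched in a few lines of Python. With aligned segments, the segment index covering a given time is the same in every rendition, so switching at the 10 sec mark only means looking up a different byte range. The two segment tables below are invented numbers (roughly what 5-second segments at 0.5 Mbps and 1 Mbps would weigh), and segment_index_at is a hypothetical helper.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple

Segment = Tuple[int, int, float]  # (first_byte, last_byte, duration_seconds)


def segment_index_at(segments: List[Segment], t: float) -> int:
    """Return the index of the segment that contains time t (in seconds)."""
    ends = list(accumulate(duration for _, _, duration in segments))
    if not ends or t < 0 or t >= ends[-1]:
        raise ValueError("time is outside the presentation")
    return bisect_right(ends, t)


# Made-up segment tables for two renditions with aligned 5-second segments.
low: List[Segment] = [(900, 313399, 5.0),
                      (313400, 625899, 5.0),
                      (625900, 938399, 5.0)]
high: List[Segment] = [(900, 625899, 5.0),
                       (625900, 1250899, 5.0),
                       (1250900, 1875899, 5.0)]

index = segment_index_at(low, 10.0)
# Same index in both files, different byte ranges.
print(index, low[index][:2], high[index][:2])
```

If the segments were not aligned, the index found in one file would not generally start at the same time in the other, and the player would have to overlap or trim media around the switch.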

-
A concise specification of what fMP4 is can also be found in the [HLS specification](https://tools.ietf.org/html/draft-pantos-http-live-streaming-23) in section 3.3. – slhck Sep 13 '17 at 08:23
-
Is sidx stored in mp4 header or some byte range? I would like to fetch the all info about sidx but don't want to download whole mp4 file and parse. thanks – 鄭元傑 Sep 10 '18 at 10:07
-
I wonder where one could find the specifications of the moof frames. There seems to be a counter and timestamp information which I need to access. – El Sampsa Jan 10 '19 at 20:08
-
So sending an moof+mdat and appending that to SourceBuffer works? Is that what media source expecting. So for example can I just concatenate 10 Frames of h264 data one after another and create only one mdat and make that wrap that 10 frame concatenated h264 data. so something like moof+mdat(10*h264 data) – Evren Bingøl Feb 01 '19 at 22:41
-
@EvrenBingøl , Did u tried your approach - moof+mdat(10*h264_data). Does Sourcebuffer works with that data ? – Kumar Sep 16 '20 at 08:56
Media File Formats
Media data streams are wrapped in a container format. The container includes the physical data of the media but also metadata that are necessary for playback. For example, it signals to the video player the codec used, subtitle tracks etc. In video streaming there are two main formats that are used for storage and presentation of multimedia content: MPEG-2 Transport Streams (MPEG-2 TS) [25] and ISO Base Media File Formats (ISOBMFF) [24] (MP4 and fragmented MP4).
MPEG-2 Transport Streams are specified by [25] and are designed for broadcasting video through satellite networks. However, Apple adopted it for its adaptive streaming protocol, making it an important format. In MPEG-2 TS audio, video and subtitle streams are multiplexed together. MP4 and fragmented MP4 (fMP4) are both part of the MPEG-4, Part 12 standard that covers the ISOBMFF. MP4 is the best-known multimedia container format and it's widely supported in different operating systems and devices. The structure of an MP4 video file is shown in figure 2.2a. As shown, MP4 consists of different boxes, each with a different functionality. These boxes are the basic building block of every container in MP4.
For example, the file type box ('ftyp') specifies the compatible brands (specifications) of the file. MP4 files have a Movie Box ('moov') that contains metadata of the media file and sample tables that are important for timing and indexing the media samples ('stbl'). Also there is a Media Data Box ('mdat') that contains the corresponding samples. In the fragmented container, shown in figure 2.2b, media samples are interleaved by using Movie Fragment boxes ('moof') which contain the sample table for the specific fragment (mdat box).
Ref : https://repository.tudelft.nl/islandora/object/uuid%3Ae06cde4c-1514-4a8d-90be-7e10eee5aac1
