While serving video to users on a website, there were a few options to choose from: HLS, Smooth Streaming, DASH, or HDS. DASH seemed to be the better choice. Looking at how it works, it splits the file into many parts and streams them. Another option would be splitting the MP4 files manually. What is the difference between DASH and splitting MP4 files yourself?
-
The manifest (mpd) file – szatmary May 04 '17 at 18:39
-
@szatmary is there any difference in total data transferred in both cases? – May 04 '17 at 19:20
-
99% of the data will be audio and video. So if it's the same encoded bitrate, no. – szatmary May 04 '17 at 19:22
-
@szatmary So about 1% extra bandwidth is expected? – May 04 '17 at 20:28
-
Extra over what? – szatmary May 04 '17 at 21:14
-
@szatmary Just like you said, 99% of the data will be audio and video, so what will the other 1% be? – May 05 '17 at 04:59
-
I didn't mean "exactly 99%". The usage of the phrase is generally accepted to mean "most" or "the vast majority". There are overheads in the manifests, containers and protocols. Different standards or implementations will have different amounts of overhead, but _the majority_ of the data will be audio and video. The rest is negligible. – szatmary May 05 '17 at 16:31
-
@szatmary Thanks man – May 05 '17 at 17:58
1 Answer
DASH, Smooth Streaming and HLS are all adaptive streaming technologies. These technologies allow you to:
- Serve content in segments - each segment is a small playable chunk of content (audio, video or even text, e.g. captions). The length of a single segment is usually a few seconds. That's what makes it a "streaming" technology, and it is very similar to what you could try to achieve by splitting MP4 files manually.
- Serve content in multiple quality levels - depending on the network connection, performance and screen resolution of the target device, the player can use an appropriate quality to reduce the chance of buffering or stuttering. To make this work, a segment with a specific index in the stream must be exactly aligned (start and length) across all quality levels - that is achieved during encoding. That's what makes it an "adaptive" technology.
- Consume a manifest - the manifest is a description of the whole content and all available quality levels. You can have a single piece of video content in 10+ quality levels with several different audio streams (different codecs or languages), each also available in a few quality levels. To consume it, the player needs to know where to find the individual segments - that is the purpose of the manifest. Different technologies use different manifest formats. DASH provides many options for describing the content. The verbose option consists of a single MP4 source file per quality level, where each segment is described by its byte offset from the beginning of the file and its byte length. But you can have more compact descriptions, like a segment template where segments are requested by index (see the sketch after this list).
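To make the difference between the verbose and the compact descriptions more concrete, here is a minimal Python sketch of how a player could turn either style into HTTP requests. The base URL, template string, representation name and byte ranges are hypothetical examples, not taken from a real manifest; real DASH players resolve template identifiers like `$RepresentationID$` and `$Number$` per the MPD specification.

```python
# Minimal sketch: turning the two DASH description styles into HTTP requests.
# All URLs, names and byte ranges below are invented for illustration.

# Compact style: a segment template, where the player substitutes the
# representation id and segment index into a URL pattern.
def template_urls(base_url, template, rep_id, first, count):
    for number in range(first, first + count):
        yield base_url + (template
                          .replace("$RepresentationID$", rep_id)
                          .replace("$Number$", str(number)))

# Verbose style: one MP4 file per quality level, where each segment is a
# byte range (offset + length) inside that file.
def byte_range_requests(file_url, segments):
    for offset, length in segments:
        # An HTTP Range header asks for exactly the bytes of one segment.
        yield file_url, {"Range": f"bytes={offset}-{offset + length - 1}"}

if __name__ == "__main__":
    for url in template_urls("https://cdn.example.com/video/",
                             "$RepresentationID$/seg-$Number$.m4s",
                             "720p", first=1, count=3):
        print("GET", url)

    for url, headers in byte_range_requests(
            "https://cdn.example.com/video/720p.mp4",
            [(0, 1_200_000), (1_200_000, 950_000)]):
        print("GET", url, headers)
```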
So while you could achieve all of that by creating your own protocol, why would you do that instead of using a standard?
To answer your question from the comments: Is there any difference in total data transferred in both cases?
In general, no. It is still the same video and audio content, with the addition of the manifest. The manifest is a text file (easily gzipped) - its size depends heavily on the way the content is described. In the case of the verbose option, it depends on the length of the content, the average segment length, the number of streams and the number of quality levels.
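As a rough back-of-envelope check (every number below is an assumption for illustration, not a measurement), you can estimate how small the manifest overhead is compared to the media itself:

```python
# Back-of-envelope estimate of manifest overhead; all numbers are assumptions.
duration_s = 2 * 60 * 60          # a 2-hour piece of content
segment_s = 4                     # typical segment length of a few seconds
quality_levels = 10               # assumed video + audio representations
bytes_per_segment_entry = 60      # rough size of one byte-range entry in the manifest

segments = duration_s // segment_s
manifest_bytes = segments * quality_levels * bytes_per_segment_entry

avg_bitrate_bps = 3_000_000       # assumed average bitrate of what is actually downloaded
media_bytes = duration_s * avg_bitrate_bps // 8

print(f"manifest ~ {manifest_bytes / 1024:.0f} KiB, "
      f"media ~ {media_bytes / 1024**2:.0f} MiB, "
      f"overhead ~ {100 * manifest_bytes / media_bytes:.3f}%")
```

Even before gzip, the manifest comes out as a tiny fraction of a percent of the media data, which is why the overhead is negligible.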
Once you start using the full power of DASH and serve lower quality levels in scenarios where the client may not need, or may not be capable of playing, the higher qualities, you can significantly reduce the amount of transferred data.
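As an illustration of that adaptive behaviour, here is a simplified sketch of the kind of quality-selection logic a player might run for each segment. The bitrate ladder and the safety factor are invented for the example; real players (dash.js, Shaka Player, etc.) use considerably more sophisticated heuristics.

```python
# Simplified adaptive bitrate selection; the ladder and safety factor are
# invented for illustration, real players use more elaborate algorithms.
BITRATE_LADDER_BPS = [500_000, 1_500_000, 3_000_000, 6_000_000]  # hypothetical representations

def pick_representation(measured_throughput_bps, safety_factor=0.8):
    """Pick the highest representation that fits comfortably within the
    measured network throughput; fall back to the lowest otherwise."""
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in BITRATE_LADDER_BPS if b <= budget]
    return max(candidates) if candidates else min(BITRATE_LADDER_BPS)

# The player re-evaluates this for every segment, so it can step down when
# the connection degrades and back up when it recovers.
print(pick_representation(4_000_000))   # -> 3000000
print(pick_representation(400_000))     # -> 500000
```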
