3

I have an MKV file that doesn't have a valid duration, and I want to change the duration parameter manually. I have gone through the Matroska specification at http://www.matroska.org/technical/specs/index.html

As far as I can tell, the specification only lists the identification magic numbers for each element; it doesn't say how long each element's data is.

How do I parse the Matroska header so that I can find the Duration field and change it?

llogan
Parth Shah

4 Answers

2

The Duration field is a float. According to the documentation, its data can be either 4 or 8 octets long. To know which size it is, look at the data-size part of the field, which uses a UTF-8-like variable-length encoding. It's explained here.
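As a minimal sketch of the point above, the function below decodes a float payload once the data size (4 or 8) is known. The name `decodeEbmlFloat` is hypothetical, and the code assumes the platform's `float` and `double` are IEEE 754 types, which is the usual case but not guaranteed by C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decode a float payload stored most significant byte first.
// `size` is the value read from the element's data-size field.
// Assumption: float/double are IEEE 754 binary32/binary64.
double decodeEbmlFloat(const unsigned char* data, std::size_t size) {
    assert(size == 4 || size == 8);  // other sizes are not handled in this sketch
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < size; ++i) {
        bits = (bits << 8) | data[i];  // accumulate big-endian bytes
    }
    if (size == 4) {
        std::uint32_t bits32 = static_cast<std::uint32_t>(bits);
        float value;
        std::memcpy(&value, &bits32, sizeof value);  // reinterpret bits as float
        return value;
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);  // reinterpret bits as double
    return value;
}
```

Building the integer byte by byte keeps the result independent of the host's endianness; only the final `memcpy` reinterprets the bits.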

Vincent
0

You can use ffprobe to get the duration of an .mkv file:

$ ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 input.mkv
5012.640000

See: https://trac.ffmpeg.org/wiki/FFprobeTips#Duration

Calumah
  • 1
    Although your answer will properly show the duration, the question is about how to manually change the duration. – llogan May 14 '18 at 18:27
  • 2
    For your information, question title before editing was : "Parse mkv file to get duration" .... – Calumah May 19 '20 at 07:13
0

Matroska uses EBML (https://en.wikipedia.org/wiki/Extensible_Binary_Meta_Language) as a base. EBML elements use variable sized integers, so it's not trivial to just change numbers. You can use FFmpeg to remux an MKV, which will rewrite the header with the correct duration.

Zipdox
0

Sorry for the long introduction; your case is covered at the end of this note.

EBML format in matroska:
https://github.com/ietf-wg-cellar/ebml-specification/blob/master/specification.markdown
matroska specification:
https://www.matroska.org/technical/elements.html

  • an MKV file consists of blocks (EBML elements),
  • every block has the same structure, made of three sections: ID, DATA_LENGTH, DATA

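ID - element identification section

  • count the zero bits before the first 1 bit to get the length of this section in bytes (leading zeros plus one); the remaining bits of the section carry the ID value, and the Matroska specification lists each ID with the leading marker bits included: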

Example:

   1f 43 b6 75

0001 1111 -> the first 1 bit is in the fourth position from the left, so the ID section is 4 bytes long. The bits after the marker give 0xF43B675; the Matroska specification lists the ID with the marker included, as 0x1F43B675, which is Cluster.

   44 89

0100 0100 -> the first 1 bit is in the second position from the left, so the ID section is 2 bytes long. The bits after the marker give 0x489; the Matroska specification lists it as 0x4489, which is Duration.
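The ID rule can be sketched in a few lines. The helper names `ebmlIdLength` and `readEbmlId` are hypothetical, and the sketch assumes IDs are at most 4 bytes long, which holds for Matroska.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Length in bytes of an element ID, derived from its first byte:
// the position of the first 1 bit, counted from the left, is the length.
// Returns 0 if no 1 bit is found in the top four bits (not a valid
// Matroska ID under the 4-byte assumption).
std::size_t ebmlIdLength(unsigned char first) {
    unsigned mask = 0x80;
    for (std::size_t len = 1; len <= 4; ++len, mask >>= 1) {
        if (first & mask) return len;
    }
    return 0;
}

// Read the ID with its marker bits kept, which is the form the
// Matroska specification uses (for example 0x4489 for Duration).
std::uint32_t readEbmlId(const unsigned char* p, std::size_t len) {
    assert(len >= 1 && len <= 4);
    std::uint32_t id = 0;
    for (std::size_t i = 0; i < len; ++i) {
        id = (id << 8) | p[i];
    }
    return id;
}
```

Keeping the marker bits means the result can be compared directly against the IDs in the specification's element table.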

DATA LENGTH - the size field follows the ID section and uses the same EBML variable-length coding

  • count the zero bits before the first 1 bit to get the length of this section in bytes; the remaining bits of the section, after that 1 bit, are the data length in bytes

     01 00 00 00 00 17 ff 0d
    

0000 0001 - the first 1 bit is in the eighth position, so the length section is 8 bytes long; the data length is 0x17ff0d bytes

   88

1000 1000 - the length section is 1 byte long; the data length is 8 bytes
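A minimal sketch of the size rule, under the assumption that the caller has at least 8 readable bytes at `p`. The name `readEbmlSize` is hypothetical, and the special "unknown size" value (all value bits set) is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode a data-length field starting at `p`.
// Stores the field's own length (1..8 bytes) in `fieldLen` and returns
// the data length with the marker bit cleared.
// If the first byte is 0x00, fieldLen is set to 0 and 0 is returned.
std::uint64_t readEbmlSize(const unsigned char* p, std::size_t& fieldLen) {
    assert(p != nullptr);
    fieldLen = 0;
    unsigned mask = 0x80;
    for (std::size_t len = 1; len <= 8; ++len, mask >>= 1) {
        if (p[0] & mask) {  // found the marker bit
            fieldLen = len;
            break;
        }
    }
    if (fieldLen == 0) return 0;
    std::uint64_t value = p[0] & (mask - 1);  // bits to the right of the marker
    for (std::size_t i = 1; i < fieldLen; ++i) {
        value = (value << 8) | p[i];
    }
    return value;
}
```

The only difference from reading an ID is that the marker bit is masked off before the value is used.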

Complex example:
00000000 4d 80 a5 47 53 74 72 65 M..GStre
00000008 61 6d 65 72 20 6d 61 74 amer mat
00000010 72 6f 73 6b 61 6d 75 78 roskamux
00000018 20 76 65 72 73 69 6f 6e version
00000020 20 31 2e 31 30 2e 34 00 1.10.4.

   4d 80

0100 1101 - a 2-byte ID; the bits after the marker give 0xD80, listed in the Matroska specification as 0x4D80 - the muxing application or library

   a5

1010 0101 - a 1-byte length section; the data area is 0x25 (37) bytes long

DATA - data block
00000003 47 53 74 72 65 GStre
00000008 61 6d 65 72 20 6d 61 74 amer mat
00000010 72 6f 73 6b 61 6d 75 78 roskamux
00000018 20 76 65 72 73 69 6f 6e version
00000020 20 31 2e 31 30 2e 34 00 1.10.4.

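Note:

  • strings may be zero-terminated or zero-padded; in this example the terminating zero is included in the size
  • numbers are stored big-endian, so they read naturally in a hex dump: 1000000 (0x0F4240) appears as 0f 42 40 (code running on a little-endian machine still has to reverse the bytes)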

Example:
Timestamp scale: 00000000 2a d7 b1 83 0f 42 40

   2a

0010 1010 - three bytes for the ID; the bits after the marker give 0xAD7B1, listed in the Matroska specification as 0x2AD7B1

   83

1000 0011 - one byte for length, data length is 3

0f 42 40 - data as BE integer - 1000000
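The integer decoding above can be sketched as follows. `decodeEbmlUint` and `ticksToSeconds` are hypothetical names; the second helper relies on the Matroska definition of the Timestamp scale as the number of nanoseconds per timestamp tick, so a scale of 1000000 means one tick is one millisecond.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode an unsigned integer payload of 1..8 bytes, most significant byte first.
std::uint64_t decodeEbmlUint(const unsigned char* data, std::size_t size) {
    assert(size >= 1 && size <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        value = (value << 8) | data[i];
    }
    return value;
}

// Convert a tick count to seconds.
// `timestampScale` is the number of nanoseconds per tick.
double ticksToSeconds(double ticks, std::uint64_t timestampScale) {
    return ticks * static_cast<double>(timestampScale) / 1e9;
}
```

With the sample bytes `0f 42 40`, `decodeEbmlUint` yields 1000000, so values counted in ticks for this file are in milliseconds.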

Duration:
00000000 44 89 88 41 22 85 dd 26 D..A"..&
00000008 35 c5 b5 5..

   44 89

0100 0100 - two bytes for the ID section (the first 1 bit is in the second position from the left); the ID is 0x489, listed in the Matroska specification as 0x4489

   88

1000 1000 - one byte for the length section (the first 1 bit is in the first position from the left); the data length is 8 bytes (the value of the remaining 7 bits)

   41 22 85 dd 26 35 c5 b5
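  • eight bytes of data; they hold a double, which gives 606958.57462900004 milliseconds, i.e. 10 minutes and 6.958574629 seconds (VLC shows this time as well).
    In C++:

    #include <cstring>  // std::memcpy
    unsigned long long durationRaw = 0x412285dd2635c5b5;  // the 8 data bytes, read as one big-endian number
    double durationMilliseconds;
    std::memcpy(&durationMilliseconds, &durationRaw, sizeof durationRaw);  // reinterpret the bits as a double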

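So, for your case: take the desired duration, express it in milliseconds as a double (8-byte float), and write its binary representation, most significant byte first, over the 8 data bytes of the Duration block. Milliseconds apply here because the Timestamp scale in this file is 1000000 ns; Duration is counted in Timestamp scale units.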
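A minimal sketch of that last step, assuming the Duration block has already been located in a buffer and that its header is exactly `44 89 88` (an 8-byte payload). The names `encodeEbmlDouble` and `patchDuration` are hypothetical, and `double` is assumed to be IEEE 754 binary64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Write `value` as 8 bytes, most significant byte first, into `out`.
// Assumption: double is a 64-bit IEEE 754 type.
void encodeEbmlDouble(double value, unsigned char* out) {
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // take the raw bit pattern
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(bits & 0xff);  // lowest byte goes last
        bits >>= 8;
    }
}

// Overwrite the payload of a Duration block in place.
// `element` must point at the 0x44 byte of the block.
// Returns false if the header is not "44 89 88", so blocks with a
// 4-byte payload are left untouched.
bool patchDuration(unsigned char* element, double newDuration) {
    assert(element != nullptr);
    if (element[0] != 0x44 || element[1] != 0x89 || element[2] != 0x88) {
        return false;
    }
    encodeEbmlDouble(newDuration, element + 3);
    return true;
}
```

Because the payload size stays at 8 bytes, nothing else in the file moves. Searching the file for the raw bytes `44 89` can hit unrelated data, so it is safer to reach the block by walking the ID and length fields from the start of the file.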