
I'm working on a library that injects metadata into a .mp4 file to allow the video to be displayed correctly as a 360 video. The input file is a standard .mp4 file in the equirectangular format. I know what metadata needs to be injected I just do not know how to inject it.

I spent some time looking around for libraries that can do this but could only find ones for extracting metadata, not injecting/embedding/writing it. The alternative I found was to use Spatial Media as a command-line application to inject the metadata more easily. The problem is that I know zero Python, so I'm leaning towards a library/NuGet package/ffmpeg script.

Does a good NuGet package/library exist that can do this, or should I go for the alternative option?

Edit 1

I have tried just pasting in the metadata into the correct place in the file, just in case it might work, but it didn't.

Edit 2

This is the metadata injected by Google's Spatial Media Tool which is what I am trying to achieve:

<?xml version="1.0"?><rdf:SphericaVideo
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:GSpherical="http://ns.google.com/videos/1.0/spherical/"><GSpherical:Spherical>true</GSpherical:Spherical><GSpherical:Stitched>true</GSpherical:Stitched><GSpherical:StitchingSoftware>Spherical Metadata Tool</GSpherical:StitchingSoftware><GSpherical:ProjectionType>equirectangular</GSpherical:ProjectionType></rdf:SphericalVideo>

Edit 3

I've also tried to do it with ffmpeg like so:

    ffmpeg -i input.mp4 -movflags use_metadata_tags -metadata Spherical=true -metadata Stitched=true -metadata ProjectionType=equirectangular -metadata StitchingSoftware=StreetviewJourney -codec copy output.mp4

I think the issue with the ffmpeg method is that it does not contain the rdf:SphericalVideo part which allows the spherical video tags to be used.

Edit 4

When I extract the metadata using ffmpeg, the spherical tag appears in the logs but not in the ffmetadata file I output. This was the command I used:

    ffmpeg -i injected.mp4 -map_metadata -1 -f ffmetadata data.txt

This is the output of the log:

 fps, 60 tbr, 15360 tbn, 120 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Side data:
      spherical: equirectangular (0.000000/0.000000/0.000000)

Edit 5

I also tried to get the metadata using this command:

    ffprobe -v error -select_streams v:0 -show_streams -of default=noprint_wrappers=1 injected.mp4

This was the output:

TAG:handler_name=VideoHandler
side_data_type=Spherical Mapping
projection=equirectangular
yaw=0
pitch=0
roll=0

I then tried to use this command but it didn't work:

    ffmpeg -i chapmanspeak.mp4 -movflags use_metadata_tags -metadata side_metadata_type="Spherical Mapping" -metadata projection=equirectangular -metadata yaw=0 -metadata pitch=0 -metadata roll=0 -codec copy output.mp4

Edit 6

I tried @VC.One's method but I must be doing something wrong because the output file is unplayable. Here is my code:

        public static void Metadata(string inputFile, string outputFile)
        {
            byte[] metadata = HexStringToByteArray("3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 22 31 2E 30 22 3F 3E 3C 72 64 66 3A 53 70 68 65 72 69 63 61 6C 56 69 64 65 6F 0A 78 6D 6C 6E 73 3A 72 64 66 3D 22 68 74 74 70 3A 2F 2F 77 77 77 2E 77 33 2E 6F 72 67 2F 31 39 39 39 2F 30 32 2F 32 32 2D 72 64 66 2D 73 79 6E 74 61 78 2D 6E 73 23 22 0A 78 6D 6C 6E 73 3A 47 53 70 68 65 72 69 63 61 6C 3D 22 68 74 74 70 3A 2F 2F 6E 73 2E 67 6F 6F 67 6C 65 2E 63 6F 6D 2F 76 69 64 65 6F 73 2F 31 2E 30 2F 73 70 68 65 72 69 63 61 6C 2F 22 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 70 68 65 72 69 63 61 6C 3E 74 72 75 65 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 70 68 65 72 69 63 61 6C 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 65 64 3E 74 72 75 65 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 65 64 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 69 6E 67 53 6F 66 74 77 61 72 65 3E 53 70 68 65 72 69 63 61 6C 20 4D 65 74 61 64 61 74 61 20 54 6F 6F 6C 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 69 6E 67 53 6F 66 74 77 61 72 65 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 50 72 6F 6A 65 63 74 69 6F 6E 54 79 70 65 3E 65 71 75 69 72 65 63 74 61 6E 67 75 6C 61 72 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 50 72 6F 6A 65 63 74 69 6F 6E 54 79 70 65 3E 3C 2F 72 64 66 3A 53 70 68 65 72 69 63 61 6C 56 69 64 65 6F 3E");
            byte[] stco = HexStringToByteArray("73 74 63 6F");
            byte[] moov = HexStringToByteArray("6D 6F 6F 76");
            byte[] trak = HexStringToByteArray("74 72 61 6B");

            byte[] file = File.ReadAllBytes(inputFile);

            //find trak
            int trakPosition = 0;
            for (int a = 0; a < file.Length - trak.Length; a++)
            {
                for (int b = 0; b < trak.Length; b++)
                {
                    if (file[a + b] != trak[b])
                        break;
                    if (b == trak.Length - 1)
                        trakPosition = a;
                }
            }
            if (trakPosition == 0)
                throw new FileLoadException();

            //add metadata
            int trakLength = BitConverter.ToInt32(new ArraySegment<byte>(file, trakPosition - 4, 4).Reverse().ToArray(), 0);
            var fileList = file.ToList();
            fileList.InsertRange(trakPosition - 4 + trakLength, metadata);
            file = fileList.ToArray();

            ////change length - tried this as well
            //byte[] trakBytes = BitConverter.GetBytes(trakLength + metadata.Length).Reverse().ToArray();
            //for (int i = 0; i < 4; i++)
            //    file[trakPosition - 4 + i] = trakBytes[i];

            //find moov
            int moovPosition = 0;
            for (int a = 0; a < file.Length - moov.Length; a++)
            {
                for (int b = 0; b < moov.Length; b++)
                {
                    if (file[a + b] != moov[b])
                        break;
                    if (b == moov.Length - 1)
                        moovPosition = a;
                }
            }
            if (moovPosition == 0)
                throw new FileLoadException();

            //change length
            int moovLength = BitConverter.ToInt32(new ArraySegment<byte>(file, moovPosition - 4, 4).Reverse().ToArray(), 0);
            byte[] moovBytes = BitConverter.GetBytes(moovLength + metadata.Length).Reverse().ToArray();
            for (int i = 0; i < 4; i++)
                file[moovPosition - 4 + i] = moovBytes[i];

            //find stco
            int stcoPosition = 0;
            for (int a = 0; a < file.Length - stco.Length; a++)
            {
                for (int b = 0; b < stco.Length; b++)
                {
                    if (file[a + b] != stco[b])
                        break;
                    if (b == stco.Length - 1)
                        stcoPosition = a;
                }
            }
            if (stcoPosition == 0)
                throw new FileLoadException();

            //modify entries
            int stcoEntries = BitConverter.ToInt32(new ArraySegment<byte>(file, stcoPosition + 8, 4).Reverse().ToArray(), 0);
            for (int a = stcoPosition + 12; a < stcoPosition + 12 + stcoEntries * 4; a += 4)
            {
                int entryLength = BitConverter.ToInt32(new ArraySegment<byte>(file, a, 4).Reverse().ToArray(), 0);
                byte[] newEntry = BitConverter.GetBytes(entryLength + metadata.Length).Reverse().ToArray();
                for (int b = 0; b < 4; b++)
                    file[a + b] = newEntry[b];
            }

            File.WriteAllBytes(outputFile, file);
        }

        private static byte[] HexStringToByteArray(string hex)
        {
            hex = hex.Replace(" ", "");
            return Enumerable.Range(0, hex.Length)
                             .Where(x => x % 2 == 0)
                             .Select(x => Convert.ToByte(hex.Substring(x, 2), 16))
                             .ToArray();
        }

The bytes are reversed because MP4 stores its integers in big-endian order, while BitConverter reads them as little-endian on my machine. I also tried to update the length of trak, but that didn't work either.

TrueCP5
  • I found this python package https://mutagen.readthedocs.io/en/latest/index.html maybe you could use parts of it in C# – Jochen Kühner Aug 10 '19 at 07:04
  • Try this one https://stackoverflow.com/questions/220097/read-write-extended-file-properties-c/46648086#46648086 – Walter Verhoeven Aug 10 '19 at 08:17
  • What atom does your metadata go into? How many bytes long is it? Try to add that length to STCO atom's entries of the a/v chunk offsets... – VC.One Aug 10 '19 at 10:43
  • @ComputerAidedTradingSystems How can I use that to write the metadata to a file and not just read it? – TrueCP5 Aug 10 '19 at 15:37
  • That would depend on the file format you are using. Have a look at https://www.codeproject.com/Articles/9489/CMp3Tags-id3v1-1-Tag-Reader-Writer – Walter Verhoeven Aug 11 '19 at 15:27
  • Plus +1 for actively trying to solve the issue. Let me check why it's not working... – VC.One Aug 15 '19 at 12:29

1 Answer

Short version:

  • Your metadata is 454 bytes long.

  • Add the metadata at the end of the trak atom.

  • Update the size of trak by increasing its stored value by 454.

  • Update the size of moov by increasing its stored value by 454.

  • Find stco and update each listed offset entry by increasing its value by 454.

Then test the video file by uploading it to YouTube. Here is my example on YouTube.

Long version:

"I have tried just pasting in the metadata into the correct place in the file"

That will not work because it displaces (or pushes) the bytes of the audio/video data. These byte positions of a/v data are stored in the MP4 header, so if you add any new bytes in there, you'll have to also update other sections that rely on some previously correct offsets to display a frame.

I'm not sure which part you call "the correct place" (which atom? under which box?).
Whatever you do, after adding the metadata bytes you must update the sizes of MOOV and TRAK, and every offset listed in STCO must increase by the number of bytes added.

Your metadata has a length of 454 bytes. (The XML text listed below is 430 bytes on its own; the 454 figure matches that XML plus a 24-byte uuid box header: a 4-byte size, the type uuid, and a 16-byte ID.)
This means sizes and offsets must be updated with: This_Atom_Value += 454;
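
One way to account for the 454 figure: Google's Spherical Video V1 RFC stores the XML inside a uuid box under trak, and that box starts with a 24-byte header (a 4-byte big-endian size, the type uuid, and the 16-byte ID ffcc8263-f855-4a93-8814-587a02521fdd). A minimal C# sketch of wrapping a payload that way is below. The class and method names are made up for illustration, and it assumes .NET Core 2.1 or later for BinaryPrimitives.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

static class SphericalUuidBox
{
    // Box ID from the Spherical Video V1 RFC: ffcc8263-f855-4a93-8814-587a02521fdd
    static readonly byte[] SphericalId =
    {
        0xFF, 0xCC, 0x82, 0x63, 0xF8, 0x55, 0x4A, 0x93,
        0x88, 0x14, 0x58, 0x7A, 0x02, 0x52, 0x1F, 0xDD
    };

    // Hypothetical helper: returns size + "uuid" + ID + XML as one byte array.
    public static byte[] Wrap(byte[] xmlBytes)
    {
        byte[] box = new byte[8 + SphericalId.Length + xmlBytes.Length];

        // The size field counts the whole box, header included, in big-endian order.
        BinaryPrimitives.WriteUInt32BigEndian(box.AsSpan(0, 4), (uint)box.Length);
        Encoding.ASCII.GetBytes("uuid").CopyTo(box, 4);
        SphericalId.CopyTo(box, 8);
        xmlBytes.CopyTo(box, 24);
        return box;
    }
}
```

With the 430-byte XML as payload, Wrap returns 454 bytes beginning 00 00 01 C6 75 75 69 64. If only the raw XML is inserted without this header, the end of trak no longer reads as a child box, which may be enough to make a file unplayable.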

"I know what metadata needs to be injected I just do not know how to inject it."

Possible solution: This is what I did to get it working on Youtube...

1) First make sure your metadata is correct (not just as text, but also in the bytes/hex format).

For example, in your first line there is a missing "l" in "Spherical" and so it won't work.

  • You say: <rdf:SphericaVideo..xmlns.
  • Correct: <rdf:SphericalVideo.xmlns.

The correct metadata XML is:

<?xml version="1.0"?><rdf:SphericalVideo.xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#".xmlns:GSpherical="http://ns.google.com/videos/1.0/spherical/"><GSpherical:Spherical>true</GSpherical:Spherical><GSpherical:Stitched>true</GSpherical:Stitched><GSpherical:StitchingSoftware>Spherical Metadata Tool</GSpherical:StitchingSoftware><GSpherical:ProjectionType>equirectangular</GSpherical:ProjectionType></rdf:SphericalVideo>

Below are the correct bytes of the above metadata text, for adding to the MP4 file:

3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 22 31 2E 30 22 3F 3E 3C 72 64 66 3A 53 70 68 65 72 69 63 61 6C 56 69 64 65 6F 0A 78 6D 6C 6E 73 3A 72 64 66 3D 22 68 74 74 70 3A 2F 2F 77 77 77 2E 77 33 2E 6F 72 67 2F 31 39 39 39 2F 30 32 2F 32 32 2D 72 64 66 2D 73 79 6E 74 61 78 2D 6E 73 23 22 0A 78 6D 6C 6E 73 3A 47 53 70 68 65 72 69 63 61 6C 3D 22 68 74 74 70 3A 2F 2F 6E 73 2E 67 6F 6F 67 6C 65 2E 63 6F 6D 2F 76 69 64 65 6F 73 2F 31 2E 30 2F 73 70 68 65 72 69 63 61 6C 2F 22 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 70 68 65 72 69 63 61 6C 3E 74 72 75 65 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 70 68 65 72 69 63 61 6C 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 65 64 3E 74 72 75 65 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 65 64 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 69 6E 67 53 6F 66 74 77 61 72 65 3E 53 70 68 65 72 69 63 61 6C 20 4D 65 74 61 64 61 74 61 20 54 6F 6F 6C 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 53 74 69 74 63 68 69 6E 67 53 6F 66 74 77 61 72 65 3E 3C 47 53 70 68 65 72 69 63 61 6C 3A 50 72 6F 6A 65 63 74 69 6F 6E 54 79 70 65 3E 65 71 75 69 72 65 63 74 61 6E 67 75 6C 61 72 3C 2F 47 53 70 68 65 72 69 63 61 6C 3A 50 72 6F 6A 65 63 74 69 6F 6E 54 79 70 65 3E 3C 2F 72 64 66 3A 53 70 68 65 72 69 63 61 6C 56 69 64 65 6F 3E

Note: the . between SphericalVideo and xmlns is not a fullstop (byte: 2E) but is instead a line-break marker (byte: 0A). If you simply copy/paste the XML text itself, then you may be missing some bytes which don't show as text (becoming full-stops or blank spaces depending on your text viewer).
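
To avoid losing those 0A bytes in a copy/paste, the byte array can also be built from a string with explicit \n escapes instead of from pasted hex. A small sketch (the class name is illustrative only):

```csharp
using System;
using System.Text;

static class SphericalXml
{
    // "\n" is written explicitly so the two 0A bytes cannot be dropped by an editor.
    public const string Text =
        "<?xml version=\"1.0\"?><rdf:SphericalVideo\n" +
        "xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n" +
        "xmlns:GSpherical=\"http://ns.google.com/videos/1.0/spherical/\">" +
        "<GSpherical:Spherical>true</GSpherical:Spherical>" +
        "<GSpherical:Stitched>true</GSpherical:Stitched>" +
        "<GSpherical:StitchingSoftware>Spherical Metadata Tool</GSpherical:StitchingSoftware>" +
        "<GSpherical:ProjectionType>equirectangular</GSpherical:ProjectionType>" +
        "</rdf:SphericalVideo>";

    // Encoding.UTF8.GetBytes does not prepend a byte-order mark.
    public static byte[] ToBytes() => Encoding.UTF8.GetBytes(Text);
}
```

ToBytes() returns 430 bytes, with 0A at index 40 and at index 96, matching the hex listing above.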

2) Adding the metadata.

(a) Update TRAK by adding the 454 metadata bytes at the end of the trak box.

To find this trak look for bytes 74 72 61 6B and note the position as integer atom_start. The previous 4 bytes hold the size of this atom. So going to position atom_start minus 4 and doing readInt will give you the atom_size.

(b) From position atom_start minus 4 you now jump forward by atom_size and then insert your metadata bytes there.
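
The lookup in step 2 could be sketched as follows. This is only an illustration of the byte search described above, with hypothetical helper names, and it assumes .NET Core 2.1 or later for BinaryPrimitives. Note that it returns the first match and stops. A search loop without a break keeps the last match instead, which in a file with both audio and video may be a different trak.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

static class BoxLocator
{
    // Index of the first occurrence of a four-character code at or after 'from', or -1.
    // The search starts no earlier than index 4 so the size field before it exists.
    public static int FindFourCc(byte[] data, string fourCc, int from = 0)
    {
        byte[] tag = Encoding.ASCII.GetBytes(fourCc);
        for (int i = Math.Max(from, 4); i <= data.Length - 4; i++)
        {
            if (data[i] == tag[0] && data[i + 1] == tag[1] &&
                data[i + 2] == tag[2] && data[i + 3] == tag[3])
                return i;
        }
        return -1;
    }

    // The 4 bytes before the type hold the box size as a big-endian integer.
    public static uint ReadBoxSize(byte[] data, int typePos) =>
        BinaryPrimitives.ReadUInt32BigEndian(data.AsSpan(typePos - 4, 4));

    // Position just past the last byte of the box, i.e. where new bytes would be inserted.
    public static int EndOfBox(byte[] data, int typePos) =>
        typePos - 4 + (int)ReadBoxSize(data, typePos);
}
```

A plain byte search can also hit the same four bytes inside media data, so walking the boxes by their sizes is the more robust approach if this turns out to be unreliable.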

3) Updating sizes and offsets.

(a) Update TRAK size:
Go back to position atom_start minus 4 and now overwrite the current size with logic of new_Size = current_Size + 454; to account for these newly added bytes.

(b) Update MOOV size:
Find this moov which is bytes 6D 6F 6F 76. The previous 4 bytes are a size integer and you overwrite the current size by adding +454 to account for the size increase.

(c) Update STCO offsets:
Find stco which is bytes 73 74 63 6F. Skip these 4 bytes and also skip another 4 (usually listed as 00 00 00 00). Now the next 4 bytes tell you how many offset entries are listed; use that number to know when to stop updating the following integer values. You can use a for loop for the updates.

Example of an STCO atom:

73 74 63 6F 00 00 00 00 00 00 00 02 00 00 04 E5 00 00 07 BB

Meaning...

73 74 63 6F : The ASCII text stco.
00 00 00 00 : Skip these bytes (version and flags).
00 00 00 02 : Means there are 2 entries for offsets.
00 00 04 E5 : Entry 1 = 1253 (so add +454 to this value).
00 00 07 BB : Entry 2 = 1979 (so add +454 to this value).
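
Step 3 could be sketched in C# like this. The helper names are hypothetical and the code assumes .NET Core 2.1 or later for BinaryPrimitives. It also assumes 32-bit stco tables (not co64) and that the media data sits after moov in the file; if mdat comes first, inserting into moov does not move it and the offsets should stay as they are. Each trak normally has its own stco, so a file with audio and video needs the update applied to both.

```csharp
using System;
using System.Buffers.Binary;

static class Mp4Patch
{
    // Adds 'delta' to the big-endian 32-bit size stored just before a box type.
    // 'typePos' is the index of the first byte of the type, e.g. the 'm' of "moov".
    public static void GrowBoxSize(byte[] data, int typePos, uint delta)
    {
        Span<byte> field = data.AsSpan(typePos - 4, 4);
        uint size = BinaryPrimitives.ReadUInt32BigEndian(field);
        BinaryPrimitives.WriteUInt32BigEndian(field, checked(size + delta));
    }

    // Adds 'delta' to every chunk offset in one stco box and returns the entry count.
    // 'typePos' is the index of the 's' of "stco".
    public static int ShiftStcoOffsets(byte[] data, int typePos, uint delta)
    {
        int countPos = typePos + 8; // skip "stco" (4 bytes) and version/flags (4 bytes)
        int count = checked((int)BinaryPrimitives.ReadUInt32BigEndian(data.AsSpan(countPos, 4)));
        for (int i = 0; i < count; i++)
        {
            Span<byte> entry = data.AsSpan(countPos + 4 + i * 4, 4);
            uint offset = BinaryPrimitives.ReadUInt32BigEndian(entry);
            BinaryPrimitives.WriteUInt32BigEndian(entry, checked(offset + delta));
        }
        return count;
    }
}
```

Run on the example bytes above with a delta of 454, the two entries become 00 00 06 AB (1707) and 00 00 09 81 (2433).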

[Image: Structure of STCO]

[Image: Structure of byte offsets within STCO]

VC.One
  • 1
    I tried the method but I must be doing something wrong in my code. I've updated my question with the code if you can check it for problems that would be great. – TrueCP5 Aug 21 '19 at 12:32