I am using the FFmpeg internal APIs to capture audio and video to a file with the ismv muxer.

Prior to writing the file header, the audio AAC stream time_base is set to 1/44100 and the video H.264 stream time_base is set to 1/30. Despite these settings, when avformat_write_header(oc, options) is invoked, FFmpeg internally forces the time_base of both streams to 1/10000000. Looking at the source of avformat_write_header, you can see that it lazily initializes the AVFormatContext. For both mp4 and ismv, this lazy init invokes mov_init. However, because ismv has mov->mode == MODE_ISM, mov_init overwrites every stream's time_base with 1/10000000, as can be seen on line 6230 in mov_init. mp4, on the other hand, allows the streams to keep a time base consistent with their associated codec configuration.

The logic to only allow a single timebase was added when ISMV support was added to ffmpeg. Does anyone know why this is necessary (except to support mp4split tooling as stated in the code comments)?

I am finding this confusing and problematic as it relates to writing pts (presentation timestamp) values. I'm relatively new to this space, but my understanding is that:

  1. The time base is the duration of one pts unit in seconds, so a time base of 1/10000000 means 10,000,000 units per second. For ISMV, the value pts=1 is therefore 0.1 microseconds.
  2. The maximum supported pts value in an ISMV is 2^33 or 8589934592. This limits the maximum pts to about 859 seconds.
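
To make the arithmetic in the two points above concrete, here is a minimal sketch in plain C. It does not use any FFmpeg calls; the helper names tick_ns and seconds_before_limit are hypothetical, and the 33-bit limit is simply the figure assumed in point 2, not something confirmed for ISMV.

```c
#include <assert.h>
#include <stdint.h>

/* Length of one pts tick in nanoseconds (truncated), given ticks per second. */
static int64_t tick_ns(int64_t ticks_per_second)
{
    assert(ticks_per_second > 0);
    return 1000000000LL / ticks_per_second;
}

/* Whole seconds that fit before pts reaches 2^bits ticks.
 * The bit width is an input because the 2^33 limit is only an assumption. */
static int64_t seconds_before_limit(int bits, int64_t ticks_per_second)
{
    assert(bits > 0 && bits < 63 && ticks_per_second > 0);
    return (INT64_C(1) << bits) / ticks_per_second;
}
```

With these helpers, tick_ns(10000000) evaluates to 100 (nanoseconds, i.e. 0.1 microseconds) and seconds_before_limit(33, 10000000) evaluates to 858 whole seconds, a little over 14 minutes.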

Since I am scaling my packet pts before writing them using av_packet_rescale_ts(packet, codec.time_base, stream.time_base), this results in large pts values. I have read references to allowing pts to roll over at 2^33. Is this the correct way to deal with the ISMV timebase? Is there something else I am missing?
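
To show how quickly the rescaled values grow, here is a rough stand-in for the rescaling step in plain C. It is not FFmpeg's implementation: rescale_pts is a hypothetical helper that handles only non-negative pts, rounds to the nearest tick, and has no overflow protection, so it is meant for illustration only. The 1024-sample AAC frame size in the usage note is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a non-negative pts from time base src_num/src_den to
 * dst_num/dst_den, rounding to the nearest destination tick.
 * Illustration only: the intermediate product can overflow for large pts. */
static int64_t rescale_pts(int64_t pts,
                           int64_t src_num, int64_t src_den,
                           int64_t dst_num, int64_t dst_den)
{
    assert(pts >= 0 && src_num > 0 && src_den > 0 && dst_num > 0 && dst_den > 0);
    int64_t numerator   = pts * src_num * dst_den; /* pts in seconds, scaled up */
    int64_t denominator = src_den * dst_num;
    return (numerator + denominator / 2) / denominator;
}
```

Going from 1/30 to 1/10000000, video frame 1 maps to rescale_pts(1, 1, 30, 1, 10000000) = 333333. Going from 1/44100, an audio packet starting at sample 1024 maps to rescale_pts(1024, 1, 44100, 1, 10000000) = 232200. Frame 27000, which is 15 minutes of 30 fps video, maps to 9000000000, which is already above 2^33 = 8589934592.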

Thanks in advance!

NaderNader
  • If you can locate the ISMV spec which allows for other timescales, I'll submit a patch to get it changed. – Gyan Jan 04 '19 at 06:54
  • It likely stems from Smooth Streaming protocol standard that utilizes ISMV, where Microsoft specifies default timescale as 10000000. See for instance section 2.2.2.1 in https://winprotocoldoc.blob.core.windows.net/productionwindowsarchives/MS-SSTR/[MS-SSTR].pdf. The value being default seemingly does not mean that it is mandatory, so changing it should be possible, but as you saw in that mp4split comment, tools and players handling ISMV fragments might be expecting the default value. – tbucher Jan 04 '19 at 09:59
  • Thank you tbucher for posting the link to the specification. It indicates that the timescale can be specified and will default to 10000000 if not specified. – NaderNader Jan 04 '19 at 13:58