
I have a Python program that receives a sequence of H264 video frames over the network, which I want to display and, optionally, record. The camera records at 30FPS and sends frames as fast as it can, which isn't consistently 30FPS due to changing network conditions; sometimes it falls behind and then catches up, and rarely it drops frames entirely.

The "display" part is easy; I don't need to care about timing or stream metadata, just display the frames as fast as they arrive:

import av

input = av.open(get_video_stream())
for packet in input.demux(video=0):
  for frame in packet.decode():
    # A bunch of numpy and pygame code here to convert the frame to RGB
    # row-major and blit it to the screen
    ...

The "record" part looks like it should be easy:

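import av

input = av.open(get_video_stream())
output = av.open(filename, 'w')
# Create the output stream with the same codec parameters as the input
output.add_stream(template=input.streams[0])
for packet in input.demux(video=0):
  for frame in packet.decode():
    # ...display code...
    ...
  # Reassign the packet to the output stream and mux it without re-encoding
  packet.stream = output.streams[0]
  output.mux_one(packet)
output.close()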

And indeed this produces a valid MP4 file containing all the frames, and if I play it back with mplayer -fps 30 it works fine. But that -fps 30 is absolutely required:

$ ffprobe output.mp4
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 960x720,
                  1277664 kb/s, 12800 fps, 12800 tbr, 12800 tbn, 25600 tbc (default)

Note the 12,800 frames/second. It should look something like this (produced by calling mencoder -fps 30 and piping the frames into it):

$ ffprobe mencoder_test.mp4
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 960x720,
                  2998 kb/s, 30 fps, 30 tbr, 90k tbn, 180k tbc (default)

Inspecting the packets and frames I get from the input stream, I see:

stream: time_base=1/1200000
codec: framerate=25 time_base=1/50
packet: dts=None pts=None duration=48000 time_base=1/1200000
frame: dts=None pts=None time=None time_base=1/1200000

So the packets and frames have no timestamps at all. They carry a time_base that matches neither the timebase that ends up in the final file nor the actual framerate of the camera, and the codec reports a framerate and time_base that match neither the final file, the camera framerate, nor the other video stream metadata!
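The numbers in that dump can be cross-checked with a little arithmetic. The following is a minimal standard-library sketch using only the values copied from the dump; it does not touch PyAV. It suggests that the packet duration of 48000 ticks in a 1/1200000 time base works out to 25 fps, which agrees with the codec's reported framerate but not with the camera's 30 fps.

```python
from fractions import Fraction

# Values copied from the stream/packet dump above.
time_base = Fraction(1, 1200000)   # seconds per tick
packet_duration_ticks = 48000

# Duration of one packet in seconds, and the frame rate that implies.
frame_seconds = packet_duration_ticks * time_base
implied_fps = 1 / frame_seconds

# Duration (in ticks) a packet would need for 30 fps in the same time base.
ticks_at_30fps = Fraction(1, 30) / time_base

print(frame_seconds)   # 1/25
print(implied_fps)     # 25
print(ticks_at_30fps)  # 40000
```

So a duration of 40000 ticks, rather than 48000, would be the value consistent with a 30 fps stream in this time base.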

The PyAV documentation is all but entirely absent when it comes to issues of timing and framerate, but I have tried manually setting various combinations of stream, packet, and frame time_base, dts, and pts with no success. I can always remux the recorded videos again to get the correct framerate, but I'd rather write video files that are correct in the first place.
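To make the timestamp arithmetic concrete, here is a sketch of how constant-rate timestamps for 30 fps could be computed. It is an illustration only, not a confirmed fix: `constant_rate_pts` is a hypothetical helper, the 1/90000 time base is an assumption borrowed from the `90k tbn` in the mencoder output above, and the idea of assigning the result to `packet.pts` and `packet.dts` assumes a stream without B-frames, so that decode order equals presentation order.

```python
from fractions import Fraction

def constant_rate_pts(frame_index, fps=30, time_base=Fraction(1, 90000)):
    """Hypothetical helper: timestamp, in time_base ticks, of frame
    number frame_index in a stream running at a fixed fps."""
    seconds = Fraction(frame_index, fps)   # presentation time in seconds
    return int(seconds / time_base)        # convert seconds to ticks

# Timestamps of the first four frames at 30 fps in a 1/90000 time base.
pts_values = [constant_rate_pts(i) for i in range(4)]
print(pts_values)  # [0, 3000, 6000, 9000]

# In the remux loop, the value would be assigned before muxing, e.g.
#   packet.pts = packet.dts = constant_rate_pts(i)
# (assumption: no B-frames, so dts == pts).
```

Because the camera occasionally drops frames, a pure frame counter like this would slowly drift from wall-clock time; timestamps derived from packet arrival time would be the alternative trade-off.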

So, how do I get PyAV to remux the video in a way that produces an output that is correctly marked as 30fps?

ToxicFrog
    Did you ever figure this out? I'm using PyAV to grab a networked webcam and cannot for the life of me figure out how to set the framerate. – Paul Allsopp Apr 05 '20 at 23:37

0 Answers