
I'm somewhat new to GStreamer and wanted to set up an RTSP server in Python that streams either MP4s or raw H.264 chunks. I noticed that the streams can take up a lot of CPU time, and I assume that's due to encoding each frame in H.264. The video files are already H.264-encoded (and even if they weren't, I could encode them with ffmpeg beforehand), so wouldn't it make sense to just parse them and pass them on to the next point in the pipe? But I'm not entirely certain how the GStreamer launch string operates.

The launch string I currently have working is as follows:

launch_string = 'appsrc name=source block=false format=GST_FORMAT_TIME ' \
                'caps=video/x-raw,format=BGR,width={},height={},framerate={}/1 ' \
                '! videoconvert ! video/x-raw,format=I420 ' \
                '! x264enc speed-preset=veryfast tune=zerolatency ' \
                '! queue min-threshold-time=300000000 max-size-time=10000000000 max-size-bytes=0 max-size-buffers=0 ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96 '.format(opt.image_width, opt.image_height, self.fps)

I've tried a few others like:

launch_string = 'appsrc name=source block=false format=GST_FORMAT_TIME ' \
                'caps=video/x-h264,format=BGR,width={},height={},framerate={}/1 ' \
                '! h264parse ' \
                '! queue min-threshold-time=300000000 max-size-time=10000000000 max-size-bytes=0 max-size-buffers=0 ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96 '.format(opt.image_width, opt.image_height, self.fps)

but I end up with empty H.264 RTP packets. I think this is just because I don't understand how the pipeline works, so any explanation would be helpful.

This is also my main loop:

    def on_need_data(self, src, length):

        # Loop the video or exit once the last frame has been pushed
        if self.number_frames >= self.max_frame:
            if LOOP:
                self.reset_video()
            else:
                Gloop.quit()

        if self.cap.isOpened():
            ret, frame = self.cap.read()
            if ret:
                # Resize if the decoded frame doesn't match the advertised caps
                if frame.shape[:2] != (self.height, self.width):
                    if self.debug >= 2:
                        print("Resizing frame")
                        print(frame.shape[:2])
                        print((self.height, self.width))
                    frame = cv2.resize(frame, (self.width, self.height))
                # Wrap the raw BGR bytes in a Gst.Buffer and timestamp it
                data = frame.tobytes()
                buf = Gst.Buffer.new_allocate(None, len(data), None)
                buf.fill(0, data)
                buf.duration = self.duration
                timestamp = self.timestamp_frame * self.duration
                buf.pts = buf.dts = int(timestamp)
                buf.offset = timestamp
                self.number_frames += 1
                self.timestamp_frame += 1
                retval = src.emit('push-buffer', buf)
                if self.debug >= 2:
                    print('pushed buffer to {}, frame {}, duration {} ns, duration {} s'.format(self.device_id,
                                                                                                self.timestamp_frame,
                                                                                                self.duration,
                                                                                                self.duration / Gst.SECOND))
                if retval != Gst.FlowReturn.OK:
                    print("[INFO]: retval not OK: {}".format(retval))
                if retval == Gst.FlowReturn.FLUSHING:
                    print('Offline')
            else:
                if self.debug > 0:
                    print("[INFO]: Unable to read frame from cap: ")
                    print(self.device_id)
                    print(self.number_frames)
                    print(self.max_frame)
                Gloop.quit()

– Yelnat

2 Answers


I ended up using filesrc to solve my issue, but it was a little more finicky than I expected. I eventually got my solution by combining this answer: Seamless video loop in gstreamer, with Where are Gstreamer bus log messages?.

The key is to issue a Seek event with the SEGMENT flag, with the segment start at the beginning of the video and the segment stop at its end:

def seek_video(self):
    if opt.debug >= 1:
        print("Seeking...")
    # Non-flushing segment seek from 0 to the end of the video; the SEGMENT
    # flag makes the pipeline post SEGMENT_DONE instead of going EOS.
    self.my_player.seek(1.0,
          Gst.Format.TIME,
          Gst.SeekFlags.SEGMENT,
          Gst.SeekType.SET, 0,
          Gst.SeekType.SET, self.video_length * Gst.SECOND)

This causes a SEGMENT_DONE message to be posted on the bus, which can then be intercepted:

if message.type == Gst.MessageType.SEGMENT_DONE:
    self.seek_video()
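
For completeness, here is a minimal sketch of how the bus watch could be wired up so that the message above actually reaches a handler. This is my addition, not part of the original answer: self.my_player is the pipeline from seek_video above, while setup_bus_watch and on_message are hypothetical names.

def setup_bus_watch(self):
    # Subscribe to bus messages so SEGMENT_DONE reaches on_message below
    bus = self.my_player.get_bus()
    bus.add_signal_watch()
    bus.connect('message', self.on_message)

def on_message(self, bus, message):
    # Restart the segment each time the current one finishes, giving a seamless loop
    if message.type == Gst.MessageType.SEGMENT_DONE:
        self.seek_video()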

I've also made an example available on my GitHub, as well as a short blog post talking about what I've learned.

– Yelnat
if self.cap.isOpened():
    ret, frame = self.cap.read()

This is where your data is coming from. The snippet you have included does not show what cap is, but it is probably some kind of capture source. Perhaps OpenCV?

In any case, you need to modify your data source so that it provides H.264 packets instead of decoded frames. After that, your second pipeline has a better chance of working.
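
For example, a sketch of an RTSP factory launch string that lets GStreamer demux and parse the already-encoded H.264 without ever decoding it could look like the following (my own sketch, not from the answer; the file path is a placeholder):

launch_string = 'filesrc location=/path/to/video.mp4 ' \
                '! qtdemux ! h264parse ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96'

Here qtdemux pulls the H.264 track out of the MP4 container and h264parse prepares it for the RTP payloader; no videoconvert or x264enc is needed, which is what saves the CPU time.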

– jpa
  • Yes, cap is an H.264-encoded mp4 file, opened with OpenCV. I have another example that reads raw H.264 chunks from a websocket, which I will test to see if it works. I also used ffmpeg to convert the mp4 into a .h264 file and found the same issue of empty H.264 RTP packets. – Yelnat Feb 13 '23 at 15:24
  • @Yelnat If it is from a file, why not just use filesrc in gstreamer? OpenCV is decoding the frames for you. – jpa Feb 13 '23 at 17:11
  • I might need to do some extra AWS/other orchestration work in the background while the video is running (for things like restarting the video when SNS notifies an endpoint, or looping the video when it ends). The easiest way I could think of to do this was to write the server in Python so that I have some flexibility later. I could also try just using filesrc in the launch string and then starting/stopping it from within? I'm uncertain if that's possible. – Yelnat Feb 14 '23 at 09:12
  • Yes, you can control filesrc in many ways. Though I'm not sure how well looping a h264-compressed video works, it may need some tricks to be able to do so without decoding the frames. – jpa Feb 14 '23 at 11:04
  • Thank you, I will try messing around with it and post back here if I find any fruitful results – Yelnat Feb 14 '23 at 12:42
  • An interesting update: when streaming raw H.264 chunks over a websocket, the previous pipeline works, but I can't seem to open an mp4 and turn it into raw binary/raw H.264 frames. I'll update my answer with more info if I find a way to do it. – Yelnat Feb 15 '23 at 11:28
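
Regarding the websocket case in that last comment, here is a minimal sketch (my addition, not from the thread) of appsrc caps for pushing raw H.264 chunks; the stream-format and alignment values are assumptions about how the chunks are framed:

launch_string = 'appsrc name=source is-live=true format=GST_FORMAT_TIME ' \
                'caps=video/x-h264,stream-format=byte-stream,alignment=au ' \
                '! h264parse ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96'

Note that video/x-h264 caps describe encoded data, so raw-video fields like format=BGR don't belong on them; h264parse recovers width, height and framerate from the stream itself.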