
My goal is to read frames from an rtsp server, do some opencv manipulation, and write the manipulated frames to a new rtsp server.

I tried the following, based on "Write in Gstreamer pipeline from opencv in python", but I was unable to figure out the appropriate gst-launch-1.0 arguments to create the RTSP server. Can anyone assist with the proper arguments to gst-launch-1.0? The ones I tried got stuck at "Pipeline is PREROLLING".

import cv2

cap = cv2.VideoCapture("rtsp://....")

framerate = 25.0

out = cv2.VideoWriter('appsrc ! videoconvert ! '
  'x264enc noise-reduction=10000 speed-preset=ultrafast tune=zerolatency ! '
  'rtph264pay config-interval=1 pt=96 ! '
  'tcpserversink host=192.168.1.27 port=5000 sync=false',
  0, framerate, (640, 480))


counter = 0
while cap.isOpened():
  ret, frame = cap.read()
  if ret:
    out.write(frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
      break
  else:
    break

cap.release()
out.release()

I also tried another solution based on Write opencv frames into gstreamer rtsp server pipeline

import cv2
import gi 

gi.require_version('Gst', '1.0')
gi.require_version('GstRtspServer', '1.0') 
from gi.repository import Gst, GstRtspServer, GObject

class SensorFactory(GstRtspServer.RTSPMediaFactory):
  def __init__(self, **properties): 
    super(SensorFactory, self).__init__(**properties) 
    #self.cap = cv2.VideoCapture(0)
    self.cap = cv2.VideoCapture("rtsp://....")
    self.number_frames = 0 
    self.fps = 30
    self.duration = 1 / self.fps * Gst.SECOND  # duration of a frame in nanoseconds 
    self.launch_string = 'appsrc name=source is-live=true block=true format=GST_FORMAT_TIME ' \
                         'caps=video/x-raw,format=BGR,width=640,height=480,framerate={}/1 ' \
                         '! videoconvert ! video/x-raw,format=I420 ' \
                         '! x264enc speed-preset=ultrafast tune=zerolatency ' \
                         '! rtph264pay config-interval=1 name=pay0 pt=96'.format(self.fps)
  
  def on_need_data(self, src, length):
    if self.cap.isOpened():
      ret, frame = self.cap.read()
      if ret:
        data = frame.tostring() 
        buf = Gst.Buffer.new_allocate(None, len(data), None)
        buf.fill(0, data)
        buf.duration = self.duration
        timestamp = self.number_frames * self.duration
        buf.pts = buf.dts = int(timestamp)
        buf.offset = timestamp
        self.number_frames += 1
        retval = src.emit('push-buffer', buf) 
        
        print('pushed buffer, frame {}, duration {} ns, durations {} s'.format(self.number_frames, self.duration, self.duration / Gst.SECOND)) 

        if retval != Gst.FlowReturn.OK: 
          print(retval) 

  def do_create_element(self, url): 
    return Gst.parse_launch(self.launch_string) 

  def do_configure(self, rtsp_media): 
    self.number_frames = 0 
    appsrc = rtsp_media.get_element().get_child_by_name('source') 
    appsrc.connect('need-data', self.on_need_data) 


class GstServer(GstRtspServer.RTSPServer): 
  def __init__(self, **properties): 
    super(GstServer, self).__init__(**properties) 
    self.factory = SensorFactory() 
    self.factory.set_shared(True) 
    self.get_mount_points().add_factory("/test", self.factory) 
    self.attach(None) 


GObject.threads_init() 
Gst.init(None) 

server = GstServer() 

loop = GObject.MainLoop() 
loop.run()

This solution creates the RTSP server and streams to it. I can open the resulting RTSP stream in VLC, but it keeps displaying the first frame and does not update with new frames. Does anyone know why?

I'm looking for any solution that will enable me, with low latency, to read frames from an RTSP server into an OpenCV format, manipulate the frames, and output them to a new RTSP server (which I also need to create). The solution does not need to be based on GStreamer if something better exists.

I am using Ubuntu 16.04 with Python 2.7 and OpenCV 3.4.1.

Kofi
    [1](https://stackoverflow.com/a/46636126/2286337), [2](https://stackoverflow.com/a/47045135/2286337), [3](https://stackoverflow.com/a/50917584/2286337) – zindarod Jun 27 '18 at 09:27
  • @zindarod I tried your approach in https://stackoverflow.com/a/46636126/2286337 . I can start sender and receiver with gst-launch and see my webcam. However, the opencv code for the sender will not open VideoCapture or VideoWriter – Max la Cour Christensen Jun 27 '18 at 10:46
    OpenCV needs to have Gstreamer support for this to work. In the output of function `cv2.getBuildInformation()`, search for Gstreamer and see if it's been included. – zindarod Jun 27 '18 at 13:03
  • @zindarod thx! i now got it working by compiling opencv with gstreamer enabled. For your example in https://stackoverflow.com/a/46636126/2286337 I can watch the rtsp stream with gst-launch but how can I get vlc to open the rtsp stream? rtsp://my_ip:5000/??? – Max la Cour Christensen Jun 28 '18 at 17:24
  • @zindarod i think that i need to create an sdp file, but i am unsure what the appropriate content should be to match the codec and everything – Max la Cour Christensen Jun 28 '18 at 17:40
  • @Max la Cour Christensen Did you ever get this working? I tried your second example and also am stuck with only a single (corrupted?) frame appearing in VLC when I connect to the re-stream. – Steve Osborne Mar 11 '19 at 19:10
    @SteveOsborne I ended up abandoning gstreamer for a C++ solution based on live555 and ffmpeg – Max la Cour Christensen Mar 13 '19 at 08:00
  • For future reference, the second example is working for me if I change this line: "self.duration = 1.0 / self.fps * Gst.SECOND". This was apparent in the print statement; both duration values were 0 because of Python integer division. – Steve Osborne Mar 15 '19 at 13:18
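For reference, the GStreamer-support check that zindarod mentions in the comments boils down to the following; look for a line like "GStreamer: YES (...)" in the output, since without it GStreamer pipelines passed to VideoCapture/VideoWriter will simply fail to open.

import cv2

# Print the OpenCV build summary and search it for "GStreamer"
print(cv2.getBuildInformation())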

3 Answers


I once did something similar: reading frames from an RTSP server and processing them in OpenCV. For some reason cv2's VideoCapture did not work for me, so my solution was to use ffmpeg to convert the RTSP input into a stream of raw bitmaps; for my problem it was fine to read a grayscale image with 1 byte per pixel.

The basic implementation idea was:

  1. Running an ffmpeg process, which is what my start_reading() method does;
  2. Having a thread which reads bytes from ffmpeg's stdout through a pipe, frame by frame;
  3. Having a property on the class which returns the last frame read from ffmpeg. Note that this is asynchronous reading, as you can see from the code, but it worked fine for me.

Here's my code (it's python3 but should be easily convertible to 2.7).

import subprocess
import shlex
import time
from threading import Thread
import os
import numpy as np
import logging


class FFMPEGVideoReader(object):
    def __init__(self, rtsp_url: str, width: int = 320, height: int = 180) -> None:
        super().__init__()
        self.rtsp_url = rtsp_url
        self.width = width
        self.height = height
        self.process = None
        self.MAX_FRAME_WAIT = 5.0  # seconds to wait for a new frame before restarting ffmpeg
        self._stdout_reader = Thread(target=self._receive_output, name='stdout_reader', daemon=True)
        self._stdout_reader.start()
        self.frame_number = -1
        self._last_frame_read = -1

    def start_reading(self):
        if self.process is not None:
            self.process.kill()
            self.process = None
        # Customize your input/output params here
        command = 'ffmpeg -i {rtsp} -f rawvideo -r 4 -pix_fmt gray -vf scale={width}:{height} -'.format(rtsp=self.rtsp_url, width=self.width, height=self.height)
        logging.debug('Opening ffmpeg process with command "%s"' % command)
        args = shlex.split(command)
        FNULL = open(os.devnull, 'w')
        self.process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=FNULL)

    def _receive_output(self):
        chunksize = self.width*self.height

        while True:
            while self.process is None:
                time.sleep(1)
            # Blocks until one full frame (width*height bytes, 1 byte per pixel) arrives
            self._last_chunk = self.process.stdout.read(chunksize)
            self.frame_number += 1
    
    @property
    def frame(self):
        started = time.time()
        while self._last_frame_read == self.frame_number:
            time.sleep(0.125) # Put your FPS threshold here
            if time.time() - started > self.MAX_FRAME_WAIT:
                logging.warning('Reloading ffmpeg process...')
                self.start_reading()
                started = time.time()
        self._last_frame_read = self.frame_number

        dt = np.dtype('uint8')
        vec = np.frombuffer(self._last_chunk, dtype=dt)
        return np.reshape(vec, (self.height, self.width))


if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    vr = FFMPEGVideoReader('rtsp://192.168.1.10:554/onvif2', width=320, height=180)
    vr.start_reading()

    while True:
        print('update')
        fr = vr.frame
        np.save('frame.npy', fr)

If you need color images, then you need to change the pix_fmt in the ffmpeg command, read (width * height * channels) bytes, and reshape them with one more axis.
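As a hypothetical illustration of that (not part of the class above), a minimal generator reading color frames could look like this, assuming pix_fmt bgr24 and 3 channels so the result drops straight into OpenCV:

import shlex
import subprocess

import numpy as np


def read_bgr_frames(rtsp_url, width=320, height=180):
    # Hypothetical helper, same idea as FFMPEGVideoReader above but in color:
    # bgr24 gives 3 bytes per pixel.
    cmd = ('ffmpeg -i {u} -f rawvideo -pix_fmt bgr24 '
           '-vf scale={w}:{h} -').format(u=rtsp_url, w=width, h=height)
    proc = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE)
    chunksize = width * height * 3
    while True:
        raw = proc.stdout.read(chunksize)
        if len(raw) < chunksize:  # stream ended or ffmpeg exited
            break
        yield np.frombuffer(raw, dtype=np.uint8).reshape((height, width, 3))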

Innuendo

Another option would be to have an OpenCV VideoWriter encode the frames to H264 and send them to shmsink:

h264_shmsink = cv2.VideoWriter("appsrc is-live=true ! queue ! videoconvert ! video/x-raw, format=BGRx ! nvvidconv ! "
      "nvv4l2h264enc insert-sps-pps=1 ! video/x-h264, stream-format=byte-stream ! h264parse ! shmsink socket-path=/tmp/my_h264_sock ",
     cv2.CAP_GSTREAMER, 0, float(fps), (int(width), int(height)))

where width and height are the size of the pushed frames. Then use shmsrc (with do-timestamp enabled) as the source for a test-launch RTSP server, such as:

./test-launch "shmsrc socket-path=/tmp/my_h264_sock do-timestamp=1 ! video/x-h264, stream-format=byte-stream, width=640, height=480, framerate=30/1 ! h264parse ! video/x-h264, stream-format=byte-stream ! rtph264pay pt=96 name=pay0 "

This may add some system overhead, but it may work for low bitrates; higher bitrates may require some optimization.
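For completeness, here is a rough sketch of the loop that would feed this writer. The source URL, fps and frame size are placeholders, and the nvvidconv/nvv4l2h264enc elements assume an NVIDIA (Jetson) GStreamer install; OpenCV must be built with GStreamer support.

import cv2

fps, width, height = 30.0, 640, 480
cap = cv2.VideoCapture("rtsp://camera-ip:554/stream")  # placeholder source

h264_shmsink = cv2.VideoWriter(
    "appsrc is-live=true ! queue ! videoconvert ! video/x-raw, format=BGRx ! nvvidconv ! "
    "nvv4l2h264enc insert-sps-pps=1 ! video/x-h264, stream-format=byte-stream ! "
    "h264parse ! shmsink socket-path=/tmp/my_h264_sock ",
    cv2.CAP_GSTREAMER, 0, fps, (width, height))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (width, height))  # frame size must match the writer
    # ... OpenCV processing here ...
    h264_shmsink.write(frame)

cap.release()
h264_shmsink.release()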

SeB

I have not tried it, but you may try the following (a combined sketch follows this list):

  • Replace nvv4l2h264enc with omxh264enc, as it seems to behave better. I briefly experimented with UDP streaming and found that the default profile used by nvv4l2h264enc was higher than omxh264enc's in my case; even with the fastest preset it was losing sync over UDP, while omxh264enc kept it (maybe I missed some options).
  • Add h264parse after the H264 encoder.
  • Add rtph264pay config-interval=1 between h264parse and rtspclientsink.
  • Add a queue after appsrc.

If it works, feel free to try removing any part that turns out to be unnecessary.
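Applied to the appsrc launch string from the question, those suggestions would look roughly like this. It is an untested sketch: omxh264enc is an OMX hardware encoder (Jetson/Raspberry Pi) and may not exist on a desktop install, in which case keep x264enc; the caps and framerate are taken from the question.

# Untested sketch combining the suggestions above with the question's launch string:
# queue after appsrc, hardware encoder instead of x264enc, h264parse before the payloader.
launch_string = (
    'appsrc name=source is-live=true block=true format=GST_FORMAT_TIME '
    'caps=video/x-raw,format=BGR,width=640,height=480,framerate=30/1 '
    '! queue ! videoconvert ! video/x-raw,format=I420 '
    '! omxh264enc ! h264parse '
    '! rtph264pay config-interval=1 name=pay0 pt=96'
)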

Also note that if you upgrade your OpenCV version/build, the Python VideoWriter API may have changed (the second argument is now the backend to use, such as cv2.CAP_GSTREAMER or cv2.CAP_ANY), but that doesn't seem to apply here since you already have a working setup.
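For illustration, the newer signature mirrors the usage already shown in the shmsink answer above; the pipeline string here is just a placeholder.

import cv2

# Placeholder pipeline; the point is passing the backend explicitly as the
# second argument in newer OpenCV builds.
pipeline = ('appsrc ! videoconvert ! x264enc tune=zerolatency '
            '! rtph264pay config-interval=1 pt=96 ! fakesink')
out = cv2.VideoWriter(pipeline, cv2.CAP_GSTREAMER, 0, 25.0, (640, 480))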

Hope I helped
