
I would like to decode H.264 video sequences and show them on the screen. The sequences come from the Pi camera, and I capture them with the following code:

import io
import picamera

stream = io.BytesIO()
while True:
    with picamera.PiCamera() as camera:
        camera.resolution = (640, 480)
        camera.start_recording(stream, format='h264', quality=23)
        camera.wait_recording(15)
        camera.stop_recording()

Is there any way to decode the sequence of 'stream' data and show them with the opencv or other python libraries?

Aung Myo Htut

4 Answers


I found a solution using ffmpeg-python.
I can't verify the solution on a Raspberry Pi, so I am not sure if it's going to work for you.

Assumptions:

  • stream holds the entire captured h264 stream in memory buffer.
  • You don't want to write the stream into a file.

The solution applies the following:

  • Execute FFmpeg in a sub-process with stdin as input pipe and stdout as output pipe.
    The input is going to be the video stream (memory buffer).
    The output format is raw video frames in BGR pixel format.
  • Write stream content to the pipe (to stdin).
  • Read decoded video (frame by frame), and display each frame (using cv2.imshow)

Here is the code:

import ffmpeg
import numpy as np
import cv2
import io

width, height = 640, 480


# Seek to stream beginning
stream.seek(0)

# Execute FFmpeg in a subprocess with stdin as input pipe and stdout as output pipe
# The input is going to be the video stream (memory buffer)
# The output format is raw video frames in BGR pixel format.
# https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
# https://github.com/kkroening/ffmpeg-python/issues/156
# http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/
process = (
    ffmpeg
    .input('pipe:')
    .video
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)


# https://stackoverflow.com/questions/20321116/can-i-pipe-a-io-bytesio-stream-to-subprocess-popen-in-python
# https://gist.github.com/waylan/2353749
process.stdin.write(stream.getvalue())  # Write stream content to the pipe
process.stdin.close()  # close stdin (flush and send EOF)


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break

    # Transform the bytes read into a numpy array
    in_frame = (
        np
        .frombuffer(in_bytes, np.uint8)
        .reshape([height, width, 3])
    )

    # Display the frame
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

process.wait()
cv2.destroyAllWindows()

Note: I used stdin and stdout as pipes (instead of using named pipes), because I wanted the code to work on Windows too.


For testing the solution, I created a sample video file, and read it into memory buffer (encoded as H.264).
I used the memory buffer as input to the above code (replacing your stream).

Here is the complete code, include the testing code:

import ffmpeg
import numpy as np
import cv2
import io

in_filename = 'in.avi'

# Build synthetic video, for testing begins:
###############################################
# ffmpeg -y -r 10 -f lavfi -i testsrc=size=160x120:rate=1 -c:v libx264 -t 5 in.avi
width, height = 160, 120

(
    ffmpeg
    .input('testsrc=size={}x{}:rate=1'.format(width, height), r=10, f='lavfi')
    .output(in_filename, vcodec='libx264', crf=23, t=5)
    .overwrite_output()
    .run()
)
###############################################


# Use ffprobe to get video frames resolution
###############################################
p = ffmpeg.probe(in_filename, select_streams='v')
width = p['streams'][0]['width']
height = p['streams'][0]['height']
n_frames = int(p['streams'][0]['nb_frames'])
###############################################


# Stream the entire video as one large array of bytes
###############################################
# https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
in_bytes, _ = (
    ffmpeg
    .input(in_filename)
    .video # Video only (no audio).
    .output('pipe:', format='h264', crf=23)
    .run(capture_stdout=True) # Run synchronously, and capture stdout
)
###############################################


# Open In-memory binary streams
stream = io.BytesIO(in_bytes)

# Execute FFmpeg in a subprocess with stdin as input pipe and stdout as output pipe
# The input is going to be the video stream (memory buffer)
# The output format is raw video frames in BGR pixel format.
# https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
# https://github.com/kkroening/ffmpeg-python/issues/156
# http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/
process = (
    ffmpeg
    .input('pipe:')
    .video
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)


# https://stackoverflow.com/questions/20321116/can-i-pipe-a-io-bytesio-stream-to-subprocess-popen-in-python
# https://gist.github.com/waylan/2353749
process.stdin.write(stream.getvalue())  # Write stream content to the pipe
process.stdin.close()  # close stdin (flush and send EOF)


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break

    # Transform the bytes read into a numpy array
    in_frame = (
        np
        .frombuffer(in_bytes, np.uint8)
        .reshape([height, width, 3])
    )

    # Display the frame
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

process.wait()
cv2.destroyAllWindows()
Rotem
  • This is a very good answer and seems to be working perfectly. Would You have any suggestion as to how properly read live broadcasts by adjusting the code that You have posted? I seem to be unable to correctly implement it with ```streamlink``` for a live video. – Kaszanas Feb 09 '21 at 10:04

I don't think OpenCV knows how to decode H.264 directly, so you would have to rely on other libraries to convert it to RGB or BGR.

On the other hand, you can use format='bgr' in picamera and make your life easier.
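A minimal sketch of that approach (untested here, since picamera only runs on a Raspberry Pi; the helper name `bgr_buffer_to_frame` is mine):

```python
import numpy as np

WIDTH, HEIGHT = 640, 480

def bgr_buffer_to_frame(raw, width=WIDTH, height=HEIGHT):
    """Map one raw 'bgr' capture (width * height * 3 bytes) onto an
    OpenCV-compatible numpy array, with no decoding step required."""
    return np.frombuffer(raw, np.uint8).reshape((height, width, 3))

# On the Pi itself (sketch):
# import picamera, cv2
# with picamera.PiCamera() as camera:
#     camera.resolution = (WIDTH, HEIGHT)
#     frame = np.empty((HEIGHT, WIDTH, 3), dtype=np.uint8)
#     camera.capture(frame, format='bgr', use_video_port=True)
#     cv2.imshow('frame', frame)
```

Because the camera hands you BGR bytes directly, the array is immediately usable with cv2.imshow, with no H.264 decoding involved.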

karlphillip

I don't know exactly what you want to do, but another way to do it without FFmpeg is this:

If you read the picamera docs you will see that the video port has splitters, which you can access using the splitter_port=x (1<=x<=3) kwarg for camera.start_recording():
https://picamera.readthedocs.io/en/release-1.13/api_camera.html#picamera.PiCamera.start_recording

Basically this means that you can split the recorded stream into 2 substreams: one you encode to h264 for saving or whatever, and one you encode to an OpenCV-compatible format. https://picamera.readthedocs.io/en/release-1.13/recipes2.html?highlight=splitter#capturing-to-an-opencv-object

This all happens mainly on the GPU, so it is pretty fast (see the picamera docs for more info).

It is the same as they do here, if you need an example: https://picamera.readthedocs.io/en/release-1.13/recipes2.html?highlight=splitter#recording-at-multiple-resolutions but then with an OpenCV object and an H.264 stream.
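A sketch of the splitter approach (untested, since it needs a real Pi camera; the wrapper function and its defaults are mine): record H.264 to memory on splitter port 1 while grabbing an OpenCV-ready BGR frame from the same video port.

```python
import io
import numpy as np
# import picamera  # available on the Raspberry Pi only

def record_and_grab(camera, width=640, height=480, seconds=15):
    """Record H.264 to an in-memory stream on splitter port 1 while
    capturing a BGR frame from the same video port for OpenCV use."""
    h264_stream = io.BytesIO()
    camera.start_recording(h264_stream, format='h264', splitter_port=1)
    frame = np.empty((height, width, 3), dtype=np.uint8)
    camera.capture(frame, format='bgr', use_video_port=True)
    camera.wait_recording(seconds, splitter_port=1)
    camera.stop_recording(splitter_port=1)
    return h264_stream, frame
```

On the Pi you would call it as `record_and_grab(picamera.PiCamera())`; the returned `frame` goes straight into `cv2.imshow`, while `h264_stream` holds the encoded recording.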

n4321d

@Rotem's answer is correct, but it does not work with large video chunks.

To work with larger videos, we need to replace process.stdin.write with process.communicate. Update the following lines:

...
# process.stdin.write(stream.getvalue())  # Write stream content to the pipe
outs, errs = process.communicate(input=stream.getvalue())
# process.stdin.close()  # close stdin (flush and send EOF)
# Read decoded video (frame by frame), and display each frame (using cv2.imshow)

position = 0
ct = time.time()
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = outs[position: position + width * height * 3]
    position += width * height * 3
...
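The slicing loop above can be wrapped in a small generator (my naming; it assumes `outs` already holds the whole rawvideo output returned by `process.communicate`):

```python
import numpy as np

def iter_frames(outs, width, height):
    """Yield BGR frames from the raw bytes returned by process.communicate()."""
    frame_size = width * height * 3
    for position in range(0, len(outs) - frame_size + 1, frame_size):
        chunk = outs[position:position + frame_size]
        yield np.frombuffer(chunk, np.uint8).reshape((height, width, 3))

# Display loop (sketch):
# for frame in iter_frames(outs, width, height):
#     cv2.imshow('in_frame', frame)
#     if cv2.waitKey(100) & 0xFF == ord('q'):
#         break
```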
luckiday