
I'm working on extracting a single frame from a raw H264 video stream produced by a vendor system whose API streams live video over a websocket. However, I'm stuck on how to read the H264 stream to obtain a single frame. I tried following this question here, but to no avail. My larger objective is to retrieve a single frame and run object detection on it with OpenCV.

Below is my code using decode_image from the prior link, but the function is returning None. After stripping out the overhead information, the data does start with the 0x000001 start code, but its size is also smaller than the expected frame size (864x480 should equal a byte size of 414720, whereas what I receive is a seemingly random size per message, all significantly smaller than 414720).

I'm not sure what I'm missing to convert the H264 stream into a usable image. Any help is appreciated.

import requests
import cv2
import sys
import av
import asyncio
import websockets

global stream
stream = []

def decode_image(raw_bytes: bytes):
    code_ctx = av.CodecContext.create("h264", "r")
    packets = code_ctx.parse(raw_bytes)
    for i, packet in enumerate(packets):
        frames = code_ctx.decode(packet)
        if frames:
            return frames[0].to_image()

async def websocket_connect(cookies):
    async with websockets.connect('wss://****:7001/api/ws', extra_headers={"Cookie":cookies}) as ws:
        print('### Waiting for send Command ###')
        await ws.send('&site=***&camera=2&raw=true')
        print('### Send command sent, set the site location and camera number ###')

        async for message in ws:
            #Need to filter out the data to just what I want
            global stream
            SOCKET_MESSAGE = get_socket_message(message)
            if SOCKET_MESSAGE['msg_type'] == 0:
                VIDEO_DATA = get_video_data(message)
                stream = VIDEO_DATA['data']
        await ws.close()

async def videodata_print():
    global stream
    while True:
        print('### RUNNING INSIDE VIDEODATA_PRINT ###')
        #Check that VIDEO_DATA has values
        print(f'\nSTREAM: {stream}\n')
        if len(stream) != 0:
            tmp = stream
            frame = decode_image(tmp)
            print(frame)
            cv2.imshow('stream',frame)
            #If frame works here, then add rest of code to get the info I need out of that frame
        await asyncio.sleep(5)

async def run_livestream(cookies):
    await asyncio.gather(websocket_connect(cookies), videodata_print())


#Grab authentication token of server, need to pass cookie with websocket to connect
r = check_Authentication(CREDENTIAL)
cookies = f"token={r.cookies['token']}"
print(cookies)

asyncio.run(run_livestream(cookies))

fysloc
  • Because it's a stream, you don't get one frame at a time. You get a stream of bytes that happen to contain frames. H264 is highly compressed, and frames can be based on previous frames, so the frame size is highly variable. You won't get a video image out until the decoder has seen all the bytes for a frame. Are you getting data from `code_ctx.parse`? – Tim Roberts Apr 11 '23 at 19:34
  • I wasn't getting any data from code_ctx.parse, but after reading your statement, I edited the input to feed in a random length of the stream, and now I am able to retrieve the first frame using the same code. – fysloc Apr 11 '23 at 20:24
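
To illustrate the point made in the comment above: the decoder only yields frames once its parser has seen enough bytes for a complete frame, so it has to be fed chunks continuously rather than one websocket message at a time. Below is a minimal sketch of that pattern, assuming the same PyAV calls used in the question (`CodecContext.create`, `parse`, `decode`); the helper name `feed_chunk` is illustrative, not part of the original code.

    import av

    # One long-lived decoder context; recreating it per chunk would discard parser state.
    codec_ctx = av.CodecContext.create("h264", "r")

    def feed_chunk(chunk: bytes):
        """Feed one websocket message's worth of bytes; return any complete frames."""
        frames = []
        # parse() may return zero packets until enough bytes for a full frame have arrived
        for packet in codec_ctx.parse(chunk):
            frames.extend(codec_ctx.decode(packet))
        return frames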

1 Answer


Thanks to Tim's comment, I was able to resolve the issue.

The issue was that the websocket only delivered one message at a time, whereas I needed to store several messages in memory, append them to a byte array, and then pass that to the function.

Below is the change needed to accumulate the stream; with it, I'm now able to see the first frame.

async def websocket_connect(cookies):
    async with websockets.connect('wss://****:7001/api/ws', extra_headers={"Cookie":cookies}) as ws:
        print('### Waiting for send Command ###')
        await ws.send('&site=****&camera=2&raw=true')
        print('### Send command sent ###')

        async for message in ws:
            #await asyncprint(message)
            #Need to filter out the data to just what I want
            global stream
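            # note: stream is assumed to be initialized as bytearray() (not []) at
            # module level, so += below appends raw bytes rather than list items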
            SOCKET_MESSAGE = get_socket_message(message)
            if SOCKET_MESSAGE['msg_type'] == 0:
                VIDEO_DATA = get_video_data(message)
                stream += bytearray(VIDEO_DATA['data'])
                #print(VIDEO_DATA['data'])

        await ws.close()
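
Since the larger objective is to run OpenCV object detection on the frame, one more detail worth noting: cv2.imshow expects a NumPy array, not the PIL image returned by to_image(). A minimal sketch of the conversion, assuming PyAV's VideoFrame.to_ndarray is available; the helper name decode_to_bgr is illustrative, not part of the original code.

    import av
    import cv2

    def decode_to_bgr(raw_bytes: bytes):
        """Decode accumulated H264 bytes and return the first frame as a BGR array."""
        codec_ctx = av.CodecContext.create("h264", "r")
        for packet in codec_ctx.parse(raw_bytes):
            frames = codec_ctx.decode(packet)
            if frames:
                # bgr24 yields an HxWx3 uint8 array that OpenCV accepts directly
                return frames[0].to_ndarray(format="bgr24")
        return None

    # Usage inside videodata_print():
    #   frame = decode_to_bgr(bytes(stream))
    #   if frame is not None:
    #       cv2.imshow('stream', frame)
    #       cv2.waitKey(1)  # imshow needs a waitKey call to actually render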
fysloc