
I have written the following web app to perform pose detection on two videos. The idea is to, say, give a benchmark video in the first and a user video (either a pre-recorded one or their webcam feed) in the second, and compare the movements of the two.

import dash, cv2
import dash_core_components as dcc
import dash_html_components as html
import mediapipe as mp
from flask import Flask, Response

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose

class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()

            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False
          
            # Make detection
            results = pose.process(image)
        
            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            
            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                        mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=2), 
                                        mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2) 
                                     )

            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


def gen(camera):
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

server = Flask(__name__)
app = dash.Dash(__name__, server=server)

@server.route('/video_feed_1')
def video_feed_1():
    return Response(gen(VideoCamera(0)), mimetype='multipart/x-mixed-replace; boundary=frame')

@server.route('/video_feed_2')
def video_feed_2():
    return Response(gen(VideoCamera(0)), mimetype='multipart/x-mixed-replace; boundary=frame')

app.layout = html.Div([
    html.Img(src="/video_feed_1", style={'width' : '40%', 'padding': 10}),
    html.Img(src="/video_feed_2", style={'width' : '40%', 'padding': 10})
])

if __name__ == '__main__':
    app.run_server(debug=True)

However, when I run this code, my laptop's fans kick in and nothing renders in the browser. The app handles any single video fine, but it cannot handle two at once: if you remove either of the two functions video_feed_1() or video_feed_2() (and optionally replace the video path 0, which is the webcam, with the path to any other video, like /path/to/video.mp4), it works fine.

Also, simply displaying the two videos in the browser without pose detection works fine. You can try this out by replacing the get_frame() method in the class above with the following:

def get_frame(self):
    success, image = self.video.read()
    ret, jpeg = cv2.imencode('.jpg', image)
    return jpeg.tobytes()

So, how do I reduce the load on the browser when rendering the pose estimation of two videos simultaneously? And why is the load so high in the browser anyway, when the same pose estimation renders perfectly fine in two OpenCV pop-up windows (i.e., with cv2.imshow())?

Kristada673
  • What does `VideoCamera(0)` do? You have that in both feed routes. – Daniel Butler May 29 '21 at 02:37
  • @DanielButler `VideoCamera` takes as argument the video path. So, `0` refers to the webcam. You can replace `0` with any other video like `path/to/video.mp4`, and it will open that video. – Kristada673 May 29 '21 at 02:42
  • Both feeds' values were the same. I wasn't sure if that's the issue. – Daniel Butler May 29 '21 at 02:44
  • @DanielButler No, that's not the issue. It should show the same webcam feed in both feeds. Which it does when I'm simply relaying the webcam feed instead of doing the pose detection on them, like I mentioned in the question. I just gave the same value of `0` in both feeds here as I'd otherwise need to upload a video somewhere and put the link here, which I didn't want to do. – Kristada673 May 29 '21 at 02:46
  • Running 2 videos worked fine on my end. You can try making pose = mp_pose.Pose a global variable instead of initializing 2 instances in the class (see the sketch after these comments); I don't know if that will help. When I tried using 1 webcam and 1 video, the browser would freeze, and I couldn't get 2 webcam feeds to work at all. I'm guessing the problem could be a threading issue with how the Response and gen are trying to get the images. Can you use a different method of getting the images as a response using Dash/Flask? – Ta946 May 31 '21 at 18:17
  • I tried on my laptop and got the exact same result. I tested using Quart instead of Dash/Flask and the result is much better but still not perfect; you might want to try it out if you don't rely too much on dash components. Also, using the webcam stream is far more costly than using a video file, so unless required you should preferably not use it for both video feeds, even for testing (on my side it will display only one cam with both dash/quart, while using `VideoCamera(0)` + `VideoCamera(video_path)` works fine with quart). – EricLavault Jun 03 '21 at 17:07
  • Why do you capture frames on the server side? Is it correct that all users see the same frames? – Epic Chen Jun 05 '21 at 03:19
  • I didn't catch it at first and maybe you didn't either: I was wondering how to handle each request in a separate thread and realized that dash/flask can do it natively; there is precisely a flag to define whether or not a process should handle each request in a separate thread: `app.run_server(threaded=True)`. I leave a comment because you possibly already tried that and saw no big difference, and because on my side, even if there is a difference (it finally outputs both videos), it still lags too much I think. Again, Quart does it smoothly (see my previous comment). – EricLavault Jun 05 '21 at 15:24
  • You can also try to increase the number of processes the same way with the keyword arg. `processes=n` (handle each request in a new process, up to n concurrent processes); I can't check right now on my machine. – EricLavault Jun 05 '21 at 15:31
  • @EricLavault Yes, I did try `threaded=True`. It didn't yield much better results - now the fans don't kick in, but the webcam feed works for 3-5 seconds and then hangs. I did not try `processes` before; I tried it just now, but can't get it to work. This is the error I get: `The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.` – Kristada673 Jun 05 '21 at 15:40
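A minimal sketch of the idea Ta946 suggests above: construct the MediaPipe Pose object once per stream (here, once per VideoCamera) instead of once per frame inside get_frame(). This is only an illustration of that suggestion, not a confirmed fix, and it assumes the imports and the mp_pose/mp_drawing setup from the question.

class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)
        # One Pose instance reused for every frame of this stream,
        # instead of a new `with mp_pose.Pose(...)` block per frame.
        self.pose = mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5)

    def __del__(self):
        self.pose.close()
        self.video.release()

    def get_frame(self):
        success, image = self.video.read()
        if not success:  # end of stream or failed grab
            return b''

        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image.flags.writeable = False
        results = self.pose.process(image)  # detection with the shared instance
        image.flags.writeable = True
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
        _, jpeg = cv2.imencode('.jpg', image)
        return jpeg.tobytes()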

1 Answer


For a task that requires real-time updates like your pose estimation, I would recommend using websockets for communication. Here is a small example where a Quart server streams the data via websockets to a Dash frontend:

import asyncio
import base64
import dash, cv2
import dash_html_components as html
import mediapipe as mp
import threading

from dash.dependencies import Output, Input
from quart import Quart, websocket
from dash_extensions import WebSocket

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose


class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()

            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False

            # Make detection
            results = pose.process(image)

            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                      mp_drawing.DrawingSpec(color=(245, 117, 66), thickness=2, circle_radius=2),
                                      mp_drawing.DrawingSpec(color=(245, 66, 230), thickness=2, circle_radius=2)
                                      )

            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


# Setup small Quart server for streaming via websocket, one for each stream.
server = Quart(__name__)
n_streams = 2


async def stream(camera, delay=None):
    while True:
        if delay is not None:
            await asyncio.sleep(delay)  # add delay if CPU usage is too high
        frame = camera.get_frame()
        await websocket.send(f"data:image/jpeg;base64, {base64.b64encode(frame).decode()}")


@server.websocket("/stream0")
async def stream0():
    camera = VideoCamera("./kangaroo.mp4")
    await stream(camera)


@server.websocket("/stream1")
async def stream1():
    camera = VideoCamera("./yoga.mp4")
    await stream(camera)


# Create small Dash application for UI.
app = dash.Dash(__name__)
app.layout = html.Div(
    [html.Img(style={'width': '40%', 'padding': 10}, id=f"v{i}") for i in range(n_streams)] +
    [WebSocket(url=f"ws://127.0.0.1:5000/stream{i}", id=f"ws{i}") for i in range(n_streams)]
)
# Copy data from websockets to Img elements.
for i in range(n_streams):
    app.clientside_callback("function(m){return m? m.data : '';}", Output(f"v{i}", "src"), Input(f"ws{i}", "message"))

if __name__ == '__main__':
    threading.Thread(target=app.run_server).start()
    server.run()

While this solution performs significantly better (on my laptop at least), the resource usage is still high, so I added a delay parameter that makes it possible to lower resource usage at the expense of a reduced frame rate.
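For example, the delay can be passed where a stream coroutine is awaited; the 0.05 second value below is only an illustration, not a tuned setting:

@server.websocket("/stream0")
async def stream0():
    camera = VideoCamera("./kangaroo.mp4")
    # Sleep ~50 ms between frames to lower CPU usage (caps the stream at roughly 20 fps).
    await stream(camera, delay=0.05)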

Example app

emher
  • I get a weird error, which I am not able to understand why it arises or how to fix it, as I'm not familiar with websockets: https://user-images.githubusercontent.com/39755678/120928477-5b9b1f00-c717-11eb-9d1f-97d1d7f7ef5a.png – Kristada673 Jun 06 '21 at 14:36
  • Your error seems to be related to a Python 3.8 regression. Could you try 3.7 (that is what I used) or 3.9? https://stackoverflow.com/questions/60359157/valueerror-set-wakeup-fd-only-works-in-main-thread-on-windows-on-python-3-8-wit – emher Jun 06 '21 at 15:30
  • I upgraded my python to 3.9 just now, and reinstalled the libraries needed for this project. I still get the same error. And unfortunately I can't downgrade to 3.7 as I have past experience of some projects not working well in 3.7. – Kristada673 Jun 06 '21 at 16:08
  • Did you try the fix suggested in the linked SO question? – emher Jun 06 '21 at 16:20
  • Can't, because I'm on Mac OS, not on Windows, and the suggested fix is for Windows only :( – Kristada673 Jun 06 '21 at 17:03
  • Ah, yes, I just noticed. As an alternative fix, I have changed the example to run the quart server in the main thread and the flask server in a separate thread, which should solve the issue (it did on Python 3.8 running in Ubuntu 20.04 via WSL). – emher Jun 06 '21 at 17:33
  • You mean you edited the code here in your answer? Because I didn't notice any change in your code here, but I still copy-pasted your code and ran it (after changing the video paths - kangaroo and yoga - to mine), and I still get the same error. – Kristada673 Jun 06 '21 at 17:43
  • Yes, I edited it just now. Only the last two lines were changed, from "threading.Thread(target=server.run).start() app.run_server()" to "threading.Thread(target=app.run_server).start() server.run()". – emher Jun 06 '21 at 17:46
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/233392/discussion-between-emher-and-kristada673). – emher Jun 06 '21 at 17:50
  • Ah, it works now! Thanks a ton. I'll deep-dive into how this asyncio and websocket are working tomorrow, as I've never used these before. – Kristada673 Jun 06 '21 at 17:52
  • @emher How can I modify the clientside callback function to also take an additional input from a dash component for video path? – Atharva Katre Oct 24 '21 at 09:05