
I am trying to build a client-server architecture where I capture live video from the user's webcam using getUserMedia(). Instead of showing the video directly in a <video> tag, I want to send it to my Flask server, do some processing on the frames, and send the result back to my web page.

I have used Socket.IO to create a client-server connection. This is the script in my index.html. Please pardon my mistakes or any wrong code.

<div id="container">
    <video autoplay="true" id="videoElement"></video>
</div>
<script type="text/javascript" charset="utf-8">

    var socket = io('http://127.0.0.1:5000');

    // checking for connection
    socket.on('connect', function(){
      console.log("Connected... ", socket.connected)
    });

    var video = document.querySelector("#videoElement");


    // asking permission to access the system camera of user, capturing live 
    // video on getting true.

    if (navigator.mediaDevices.getUserMedia) {
      navigator.mediaDevices.getUserMedia({ video: true })
        .then(function (stream) {

          // instead of showing it directly in <video>, I want to send these frames to the server

          // video.srcObject = stream

          // this code might be wrong, but this is what I want to do.
          socket.emit('catch-frame', { image: true, buffer: getFrame() });
        })
        .catch(function (err) {
          console.log(err);
          console.log("Something went wrong!");
        });
    }

    // returns a frame encoded as a base64 data URL
    const getFrame = () => {
        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        const data = canvas.toDataURL('image/png');
        return data;
    }


    // receive the processed frame back from the server; I want to display it in either
    // <video> or <img>
    socket.on('response_back', function(frame){

      // this code here is wrong, but again this is something like what I want to do.
      video.srcObject = frame;
    });

</script>

In my app.py -

from flask import Flask, render_template
from flask_socketio import SocketIO, emit

app = Flask(__name__)
socketio = SocketIO(app)

@app.route('/', methods=['POST', 'GET'])
def index():
    return render_template('index.html')

@socketio.on('catch-frame')
def catch_frame(data):

    ## getting the data frames

    ## do some processing 

    ## send it back to client
    emit('response_back', data)  ## ??


if __name__ == '__main__':
    socketio.run(app, host='127.0.0.1')

I have also thought about doing this with WebRTC, but I could only find code for peer-to-peer connections.

So, can anyone help me with this? Thanks in advance for the help.

akan
  • What part of the scripts that you shared you need help with? Please add details on what's not working on this code. – Miguel Grinberg Nov 20 '19 at 15:08
  • I am facing a problem in the part where I am trying to send the stream to the server using `socket.emit('catch-frame', { image: true, buffer: getFrame() });`. When I tried to get the stream in `catch_frame(data)` like this, `frame = data`, I didn't receive any frame. Also, I have to send the frame back from the server to the client after processing. But since there are no frames, I am not getting any in `socket.on('response_back', function(frame)` from where I can source it to ` – akan Nov 22 '19 at 05:03
  • What was the value of `data` then? – Miguel Grinberg Nov 22 '19 at 09:35
  • It's giving me an object with value `{image: true, buffer: "data:,"}`. No frames – akan Nov 22 '19 at 10:07
  • And did you print `data` on the client side to confirm it actually contained image data? The most obvious explanation is that your client is sending `"data:,"`, which I guess is a data URL for an empty frame. – Miguel Grinberg Nov 23 '19 at 11:23
  • Hey @akan, Can I ask that your frontend was in which language? React or Angular? I am trying to achieve same thing but no luck till now :). Thanks. – Bilal Shafqat Feb 05 '22 at 18:42

2 Answers


So, what I was trying to do was take the real-time video stream captured by the client's webcam and process the frames at the backend.

My backend code is written in Python and I am using Socket.IO to send the frames from the frontend to the backend. This design should give a better idea about what's happening:

  1. My server (app.py) runs in the backend and the client accesses index.html.
  2. A Socket.IO connection is established, and the video stream captured by the webcam is sent to the server frame by frame.
  3. These frames are processed at the backend and emitted back to the client.
  4. The processed frames coming from the server are displayed in an img tag.

Here is the working code -

app.py

# imports needed by this handler (Flask / SocketIO setup is the same as in the question's app.py)
import base64
import io
from io import StringIO

import cv2
import imutils
import numpy as np
from PIL import Image
from flask_socketio import emit


@socketio.on('image')
def image(data_image):
    # data_image is the base64 payload sent by the client
    # (the client strips the "data:image/png;base64," prefix before emitting)
    sbuf = StringIO()
    sbuf.write(data_image)  # kept only as a sanity check on the received data

    # decode base64 and open it as a PIL image
    b = io.BytesIO(base64.b64decode(data_image))
    pimg = Image.open(b)

    # convert RGB to BGR, as per OpenCV's channel ordering
    frame = cv2.cvtColor(np.array(pimg), cv2.COLOR_RGB2BGR)

    # process the image frame
    frame = imutils.resize(frame, width=700)
    frame = cv2.flip(frame, 1)
    imgencode = cv2.imencode('.jpg', frame)[1]

    # base64-encode the processed frame and rebuild a data URL
    stringData = base64.b64encode(imgencode).decode('utf-8')
    b64_src = 'data:image/jpg;base64,'
    stringData = b64_src + stringData

    # emit the processed frame back to the client
    emit('response_back', stringData)
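For completeness, the handler above plugs into the same Flask / Socket.IO scaffolding that the question's app.py already shows; a minimal sketch of that wiring (nothing here beyond what the question posted):

from flask import Flask, render_template
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

@app.route('/', methods=['POST', 'GET'])
def index():
    return render_template('index.html')

# ... the @socketio.on('image') handler above goes here ...

if __name__ == '__main__':
    socketio.run(app, host='127.0.0.1')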

index.html

<div id="container">
    <canvas id="canvasOutput"></canvas>
    <video autoplay="true" id="videoElement"></video>
</div>

<div class="video">
    <img id="image">
</div>

<script>
    var socket = io('http://localhost:5000');

    socket.on('connect', function(){
        console.log("Connected...!", socket.connected)
    });

    const video = document.querySelector("#videoElement");

    video.width = 500; 
    video.height = 375;

    if (navigator.mediaDevices.getUserMedia) {
        navigator.mediaDevices.getUserMedia({ video: true })
        .then(function (stream) {
            video.srcObject = stream;
            video.play();
        })
        .catch(function (err) {
            console.log(err);
            console.log("Something went wrong!");
        });
    }

    // cv here comes from OpenCV.js, which must be loaded on the page for cv.Mat / cv.VideoCapture
    let src = new cv.Mat(video.height, video.width, cv.CV_8UC4);
    let dst = new cv.Mat(video.height, video.width, cv.CV_8UC1);  // declared but not used below
    let cap = new cv.VideoCapture(video);

    const FPS = 22;

    setInterval(() => {
        cap.read(src);

        var type = "image/png";
        var data = document.getElementById("canvasOutput").toDataURL(type);
        data = data.replace('data:' + type + ';base64,', ''); // split off the junk at the beginning

        socket.emit('image', data);
    }, 10000 / FPS);  // ≈ 455 ms per tick at FPS = 22, i.e. roughly 2 frames per second


    socket.on('response_back', function(image){
        const image_id = document.getElementById('image');
        image_id.src = image;
    });

</script>
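Note that the snippet assumes the Socket.IO client library and OpenCV.js are already loaded before the inline script above (that is where the io() and cv objects come from). Something along these lines should work; the versions and URLs below are only an example, adjust them to whatever you actually use:

<!-- example script includes; versions/URLs are placeholders -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/2.3.0/socket.io.js"></script>
<script src="https://docs.opencv.org/4.5.0/opencv.js" type="text/javascript"></script>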

Also, note that this needs to be served from a secure origin (HTTPS, or localhost during development); browsers only expose getUserMedia() in secure contexts.
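If the page has to be reachable from other machines (not just localhost), one option is to let Flask-SocketIO serve over HTTPS directly. My understanding is that the embedded server forwards certificate arguments when running under eventlet, but treat this as a sketch and check the Flask-SocketIO deployment docs; cert.pem and key.pem are placeholder paths to your own certificate and key:

# rough sketch: serve over HTTPS so getUserMedia() works from other machines
# (assumes the eventlet async mode; cert.pem / key.pem are placeholder paths)
socketio.run(app, host='0.0.0.0', port=5000,
             certfile='cert.pem', keyfile='key.pem')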

akan
  • Hello, I am not able to get the data from the client side to the server. Can you elaborate on the code a little? How is pimg used in the app.py code? And also, how can I open the image as an OpenCV image? Thanks – techieViN Sep 17 '20 at 08:56
  • Same @akan - I was unable to get this working when I tried. Could you please elaborate on the above, and/or post a link to a GitHub repo with the complete working code? – zen_of_python Sep 18 '20 at 02:46
  • @zen_of_python @TechieViN Thanks for pointing that out. I have edited my code to use the image with OpenCV. Hope this will solve the issue. `frame = cv2.cvtColor(np.array(pimg), cv2.COLOR_RGB2BGR)` – akan Sep 18 '20 at 10:01
  • Thanks.. and also what is sbuf used for ? – techieViN Sep 18 '20 at 10:05
  • @TechieViN That I had put for a check on the data I was receiving. The code will work even if you remove that. Also no need for Thanks. I am glad to help. If this helped you then you can upvote. :) – akan Sep 18 '20 at 10:14
  • Hi, thanks a lot for this sample. I was able to get it to work, but I have a few questions and doubts: 1) I seem to get an initialization error of the type (index):54 Uncaught TypeError: cv.Mat is not a constructor at (index):54 2) setInterval() doesn't seem to work; the image event doesn't seem to get hit. If I add a function that explicitly emits to the image event, only then do I get a message on the socket server. 3) canvasOutput.toDataURL(type) seems to contain a blank image. I'm a little curious why cap.read(src) is used when the image is extracted from data – fibonachoceres Sep 19 '20 at 19:22
  • I got it to work by passing the video element to the capture function from here: http://appcropolis.com/blog/web-technology/using-html5-canvas-to-capture-frames-from-a-video/ . Not sure about the cv errors, but I was able to tweak the code for my use case without cv.js – fibonachoceres Sep 19 '20 at 19:59
  • @fibonachoceres That's also another way to make this work. – akan Sep 21 '20 at 06:38
  • @akan: That's true, I just wanted to point out that in the solution you posted there should be 2 video streams: one from the webcam your browser has opened, and one rendered from the frames in each socket message sent back by the server. When I just use the code you provided I get only one video stream, and that is the default. Could it be because I can't get OpenCV to work? But if your OpenCV code isn't doing anything, why not remove it? Neither cap/src/dst is linked to the second element that you replace on the socket response; the other stream is the default – fibonachoceres Sep 21 '20 at 07:32
  • Is there a way to reduce the latency? – user2010672 Dec 21 '20 at 21:55
  • Hi @akan, I followed your code closely and only did minimal tweaking, but all I get is a black image. Why is that? –  Mar 26 '21 at 09:58
  • Hello @Albert, this problem may occur for a few reasons. Check whether the encoding and decoding of the image frames are done properly. Also, check whether you are receiving the images properly on the other end. Could you tell what tweaks you made to the code? – akan Apr 06 '21 at 05:46
  • The canvasOutput is empty. You can use drawImage() to get the image from the video. – tomasantunes Jan 14 '22 at 12:11

I had to tweak your solution a bit :-

I commented out the three cv variables and the cap.read(src) statement, and modified the following line

var data = document.getElementById("canvasOutput").toDataURL(type);

to

        var video_element = document.getElementById("videoElement")
        var frame = capture(video_element, 1)
        var data = frame.toDataURL(type);

Using the capture function from here :- http://appcropolis.com/blog/web-technology/using-html5-canvas-to-capture-frames-from-a-video/

I'm not sure if this is the right way to do it but it happened to work for me.

Like I said, I'm not super comfortable with JavaScript, so instead of manipulating the base64 string in JavaScript, I'd much rather just send the whole data URL from JavaScript and parse it in Python this way:

# Important to only split once
headers, image = base64_image.split(',', 1) 
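On the Python side, one way (not the only one) to turn that base64 payload into an OpenCV frame without going through PIL is cv2.imdecode; a small sketch, where b64_to_frame is just an illustrative helper name:

import base64

import cv2
import numpy as np

def b64_to_frame(base64_image):
    # split off the "data:image/...;base64," header (important to only split once)
    headers, image = base64_image.split(',', 1)
    # decode base64 into raw bytes, then into a BGR OpenCV image
    img_bytes = base64.b64decode(image)
    img_array = np.frombuffer(img_bytes, dtype=np.uint8)
    frame = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
    return frame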

My takeaway from this, at the risk of sounding circular, is that you can't directly pull an image string out of a canvas that contains a video element; you need to create a new canvas onto which you draw a 2D image of the frame you capture from the video element.
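In case the linked article moves, the capture() helper it describes is roughly along these lines (a sketch, not the article's exact code): draw the current video frame onto a fresh canvas and return that canvas, so toDataURL() can be called on it.

// sketch of a capture() helper: draws the current video frame onto a new canvas
function capture(video, scaleFactor) {
    if (scaleFactor == null) {
        scaleFactor = 1;
    }
    var w = video.videoWidth * scaleFactor;
    var h = video.videoHeight * scaleFactor;
    var canvas = document.createElement('canvas');
    canvas.width = w;
    canvas.height = h;
    var ctx = canvas.getContext('2d');
    ctx.drawImage(video, 0, 0, w, h);
    return canvas;  // e.g. capture(video_element, 1).toDataURL(type)
}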

fibonachoceres
  • This solution worked for me. Did you find a way to reduce the latency though? – 24dinitrophenylhydrazine Aug 19 '21 at 07:17
  • unfortunately the latency was something I couldn't figure out – fibonachoceres Dec 08 '21 at 05:55
  • You should be using `toBlob` instead of `toDataURL`. `toBlob` is asynchronous whereas `toDataURL` is _sloooowww_ and it blocks the JavaScript thread. However this approach is silly anyway, just use WebRTC instead, e.g. https://stackoverflow.com/questions/63549278/opencv-python-modeling-server-with-webrtc – Dai Jul 08 '22 at 18:28
  • Can somebody share the full code, as I am facing the same issue – Sourav Singh Jul 23 '23 at 17:22