The architecture I ended up with is as follows:
C++ using FFmpeg (libavdevice/libavcodec/libavformat, etc.) feeds frames into a device I created using v4l2loopback. Chrome can then detect this pseudo-device (as long as you use the exclusive_caps=1 option, as shown below).
So the first thing I do is set up the v4l2loopback device. This is a faux device that outputs like a normal camera, but also accepts input like a capture device.
git clone https://github.com/umlaeute/v4l2loopback
cd v4l2loopback
git reset --hard b2b33ee31d521da5069cd0683b3c0982251517b6 # pin to a known commit so upstream v4l2loopback changes don't break this script
make
sudo insmod v4l2loopback.ko exclusive_caps=1 video_nr=$video_nr card_label="My_Fake_Camera"
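Once the module is loaded, you can sanity-check that the loopback device exists. This assumes you have v4l-utils installed (v4l2-ctl is part of that package):

```shell
# List all video devices with their names; the loopback device should
# appear under the card_label you passed to insmod.
v4l2-ctl --list-devices

# Query the loopback device directly to confirm it responds.
v4l2-ctl -d /dev/video$video_nr --info
```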
The browser will see the device in navigator.mediaDevices.enumerateDevices() when, and only when, you're publishing to it. To test that it's working before you feed it from C++, you can use ffmpeg -re -i test.avi -f v4l2 /dev/video$video_nr. For my needs I'm using Puppeteer, so it was relatively easy to test, but keep in mind that a long-lived browser session caches the device list and refreshes it somewhat infrequently, so make sure test.avi (or any video file) is quite long (1 min+) so you have time to reset your environment fully. I never figured out exactly what the caching strategy is, so Puppeteer turned out to be very helpful here; I had already been using it, so I didn't have to set it up. YMMV.
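If you don't have a suitably long clip lying around, FFmpeg's built-in testsrc filter can generate one (the filename, resolution, and two-minute duration here are just examples):

```shell
# Generate a 2-minute synthetic test-pattern clip.
ffmpeg -f lavfi -i testsrc=duration=120:size=640x480:rate=30 test.avi

# Stream it to the loopback device in real time (-re) so the browser
# sees a live camera feed for the full two minutes.
ffmpeg -re -i test.avi -f v4l2 /dev/video$video_nr
```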
Now, the hard part (for me) was getting FFmpeg (libav-* version 2.8) to output to this device. I can't/won't share all my code, but here are the parts and some guiding wisdom:
Set up:
- Create an AVFormatContext using avformat_alloc_output_context2(&formatContext, NULL, "v4l2", "/dev/video5") (note that the first argument is the address of the context pointer itself, not of its pb field)
- Set up the AVCodec using avcodec_find_encoder and create an AVStream using avformat_new_stream
- There are a bunch of little flags you should be setting, but I won't walk through all of them in this answer. This snippet, as well as some others, includes a lot of this work, but they're all geared towards writing to disk rather than to a device. The biggest thing you need to change is creating the AVFormatContext with the device rather than a file (see the first step).
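A minimal sketch of the set-up above, assuming FFmpeg 2.8-era libav* APIs. The device path, resolution, and frame rate are placeholders, error handling is abbreviated, and the rawvideo encoder with YUV420P is an assumption (the v4l2 output device consumes raw frames; pick whatever pixel format you intend to feed it):

```cpp
extern "C" {
#include <libavdevice/avdevice.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}
#include <stdexcept>

AVFormatContext *open_v4l2_output(const char *device,
                                  int width, int height, int fps,
                                  AVStream **streamOut) {
    av_register_all();
    avdevice_register_all();  // registers the "v4l2" output device

    AVFormatContext *formatContext = nullptr;
    // Pass the address of the context pointer itself, not of ->pb.
    if (avformat_alloc_output_context2(&formatContext, NULL,
                                       "v4l2", device) < 0)
        throw std::runtime_error("could not allocate v4l2 output context");

    // Raw frames out; a camera-like feed is typically YUV420P.
    AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_RAWVIDEO);
    AVStream *stream = avformat_new_stream(formatContext, codec);

    AVCodecContext *c = stream->codec;  // 2.8-era API (pre-codecpar)
    c->width = width;
    c->height = height;
    c->pix_fmt = AV_PIX_FMT_YUV420P;
    c->time_base = AVRational{1, fps};
    stream->time_base = c->time_base;

    if (avcodec_open2(c, codec, NULL) < 0)
        throw std::runtime_error("could not open encoder");
    // The v4l2 output device handles its own I/O (no avio_open needed).
    if (avformat_write_header(formatContext, NULL) < 0)
        throw std::runtime_error("could not write header to device");

    *streamOut = stream;
    return formatContext;
}
```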
For each frame:
- Convert your image to the proper colorspace (mine is BGR, which is OpenCV's default) using OpenCV's cvtColor
- Convert the OpenCV matrix to a libav AVFrame (using sws_scale)
- Encode the AVFrame into an AVPacket using avcodec_encode_video2
- Write the packet to the AVFormatContext using av_write_frame
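The per-frame steps above can be sketched roughly like this, under the same assumptions as the set-up (a BGR cv::Mat in, a YUV420P rawvideo stream out; formatContext, stream, and frame_index are assumed to come from the set-up code, and per-frame SwsContext creation is kept for brevity — in practice you'd cache it):

```cpp
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>
}
#include <opencv2/opencv.hpp>

void write_frame(AVFormatContext *formatContext, AVStream *stream,
                 const cv::Mat &bgr, int64_t frame_index) {
    AVCodecContext *c = stream->codec;

    // Allocate the destination frame in the encoder's pixel format.
    AVFrame *frame = av_frame_alloc();
    frame->format = c->pix_fmt;
    frame->width  = c->width;
    frame->height = c->height;
    av_frame_get_buffer(frame, 32);

    // Wrap the cv::Mat's BGR data and convert BGR24 -> YUV420P.
    SwsContext *sws = sws_getContext(bgr.cols, bgr.rows, AV_PIX_FMT_BGR24,
                                     c->width, c->height, c->pix_fmt,
                                     SWS_BILINEAR, NULL, NULL, NULL);
    const uint8_t *srcData[1] = { bgr.data };
    int srcLinesize[1] = { static_cast<int>(bgr.step) };
    sws_scale(sws, srcData, srcLinesize, 0, bgr.rows,
              frame->data, frame->linesize);
    frame->pts = frame_index;

    // Encode (for rawvideo this is essentially a copy) and write out.
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    int got_packet = 0;
    if (avcodec_encode_video2(c, &pkt, frame, &got_packet) == 0 && got_packet) {
        pkt.stream_index = stream->index;
        av_write_frame(formatContext, &pkt);
        av_free_packet(&pkt);  // av_packet_unref in later FFmpeg versions
    }

    sws_freeContext(sws);
    av_frame_free(&frame);
}
```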
As long as you do all this correctly, frames will be fed into the device and you should be able to consume your video feed in the browser (or anywhere else that detects cameras).
The one thing I'll add that's necessary specifically for Docker: you have to share the v4l2 device between the host and the container, assuming you're consuming the device outside the container (which I am). This means running the docker run command with --device=/dev/video$video_nr.
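For example (the image name is a placeholder for whatever runs the C++ feeder; any other flags you need are unchanged):

```shell
# Expose the host's loopback device inside the container.
docker run --device=/dev/video$video_nr my-feeder-image
```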