
I would like to create a video file from multiple images uploaded to my site.

Until now, what I do is take these images, draw them 1-by-1 on a canvas, and use the MediaRecorder API to record them. However, there is a lot of idle time.

Instead, I want to use the VideoEncoder API.

I created an encoder that saves every chunk as a buffer:

const chunks = [];

let encoder = new VideoEncoder({
  output: (chunk) => {
    const buffer = new ArrayBuffer(chunk.byteLength);
    chunk.copyTo(buffer);
    chunks.push(buffer);
  },
  error: (e) => console.error(e.message)
});

And configured it with my settings:

encoder.configure({
  codec: 'vp8',
  width: 256,
  height: 256,
  bitrate: 2_000_000,
  framerate: 25
});

Then, I encode every image as a frame:

const frame = new VideoFrame(await createImageBitmap(image));
encoder.encode(frame, {keyFrame: true});
frame.close();

And finally, I try to create a video from it:

await encoder.flush();

const blob = new Blob(chunks, {type: 'video/webm; codecs=vp8'});
const url = URL.createObjectURL(blob);

However, the resulting blob is not playable. If I try to download it, VLC does not play it. If I set it as the source of a video element, I get:

DOMException: The element has no supported sources.

How do I encode multiple frames into a video that is playable?

How do I know which codecs / blob types are supported?

Minimal Reproduction

The following codepen is the above code, concatenated and joined into a single function. https://codepen.io/AmitMY/pen/OJxgPoG?editors=0010

Amit
  • Do you have a complete testable code to **recreate** your problem? Or maybe provide a link to the output video (downloaded blob) so we check what's wrong with the encoding (_eg:_ a possible missing webM header). – VC.One Dec 18 '21 at 17:22
  • @VC.One I have added a minimal reproduction codepen. – Amit Dec 18 '21 at 17:39

3 Answers


VideoEncoder and other classes from the WebCodecs API provide a way to encode your images as frames in a video stream; however, encoding is just the first step in creating a playable multimedia file. A file like this may contain multiple streams - for instance, when you have a video with sound, that's already at least one video and one audio stream, so a total of two. You need an additional container format to store the streams so that you do not have to send them in separate files. To create a container file from any number of streams (even just one) you need a multiplexer (muxer for short). A good summary of the topic can be found in this Stack Overflow answer, but to quote the important part:

  1. When you create a multimedia file, you use coder algorithms to encode the video and audio data, then you use a muxer to put the streams together into a file (container). To play the file, a demuxer takes apart the streams and feeds them into decoders to obtain the video and audio data.
  2. Codec means coder/decoder, and is a separate concept from the container format. Many container formats can hold lots of different types of format (AVI and QuickTime/MOV are very general). Other formats are restricted to one or two media types.

You may think "I have only one stream, do I really need a container?" but multimedia players expect received data (either data read from a file or streamed over the network) to be in a container format. Even if you have only one video stream, you still need to pack it into a container for them to recognize it.

Joining the byte buffers into one big blob of data will not work:

const blob = new Blob(chunks, {type: 'video/webm; codecs=vp8'});

Here you try to glue all the chunks together and tell the browser to interpret the result as a WebM video (the video/webm MIME type), but it cannot, because the data is not properly formatted. This in turn is the source of the error. To make it work, you have to append the relevant metadata to your chunks (usually formatted as buffers of binary data with a specific layout depending on the container type as well as the codec) and pass them to a muxer. If you use a muxing library that is designed to work with raw video streams (for example, those coming from the WebCodecs API), it will probably handle the metadata for you. As a programmer you most likely will not have to deal with this manually, but if you want to understand more about the whole process, I suggest you read about the metadata present in various container formats (for example, VC.One's comments below this answer).
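To give a concrete feel for what such metadata looks like: as VC.One explains in the comments below, a single encoded VP8 keyframe can be turned into a still WebP image just by prepending a 20-byte RIFF/WebP header. This is only an illustration of the byte-level framing a muxer normally adds for you (it produces a .webp image, not a playable .webm video); a minimal sketch:

// Minimal sketch based on VC.One's comments below: wrap one encoded VP8
// keyframe (a Uint8Array) in a RIFF/WebP header. Both size fields are
// little-endian 32-bit integers.
function wrapVp8KeyframeAsWebP(keyframeBytes) {
  const header = new Uint8Array(20);
  const view = new DataView(header.buffer);
  header.set([0x52, 0x49, 0x46, 0x46], 0);                // "RIFF"
  view.setUint32(4, 12 + keyframeBytes.byteLength, true); // RIFF payload size
  header.set([0x57, 0x45, 0x42, 0x50], 8);                // "WEBP"
  header.set([0x56, 0x50, 0x38, 0x20], 12);               // "VP8 " chunk id
  view.setUint32(16, keyframeBytes.byteLength, true);     // VP8 chunk size
  const webp = new Uint8Array(header.byteLength + keyframeBytes.byteLength);
  webp.set(header, 0);
  webp.set(keyframeBytes, header.byteLength);
  return webp; // bytes of a .webp image, not a .webm video
}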

Sadly, muxers do not seem to be a part of the WebCodecs API as of now. The example in the official repository of the API uses a muxAndSend() function as the encoder output callback:

const videoEncoder = new VideoEncoder({
  output: muxAndSend,
  error: onEncoderError,
});

And above in the code we can see that this function needs to be supplied by the programmer (original comments):

// The app provides a way to serialize/containerize encoded media and upload it.
// The browser provides the app byte arrays defined by a codec such as vp8 or opus
// (not in a media container such as mp4 or webm).
function muxAndSend(encodedChunk) { ... };
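Note that the output callback also receives a second, optional metadata argument along with each chunk; its decoderConfig field (when present) carries codec details that muxers typically need. A rough sketch of what such a callback might collect, where handToMuxer() is only a placeholder for whatever muxing code or library you end up using:

const videoEncoder = new VideoEncoder({
  output: (chunk, metadata) => {
    // Copy the encoded bytes out of the chunk.
    const data = new Uint8Array(chunk.byteLength);
    chunk.copyTo(data);
    // handToMuxer() is a placeholder, not a real API - it stands in for the
    // muxing library of your choice.
    handToMuxer({
      data,
      type: chunk.type,           // 'key' or 'delta'
      timestamp: chunk.timestamp, // microseconds
      duration: chunk.duration,   // microseconds, may be null
      decoderConfig: metadata?.decoderConfig // codec string, description, etc.
    });
  },
  error: onEncoderError
});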

Here is a link to a discussion about adding muxing support to browsers and here is an issue in the official repo tracking this feature. As of now, there does not seem to be a built in solution for your problem.

To solve it you could possibly use a third party library such as mux.js or similar (here is a link to their "Basic Usage" example which may help you). Alternatively, this project claims to create WebM containers out of VideoEncoder encoded data. This excerpt from the description of their demo seems to be exactly what you wanted to achieve (except with a webcam as the VideoFrame source, instead of a canvas):

When you click the Start button, you’ll be asked by the browser to give permission to capture your camera and microphone. The data from each is then passed to two separate workers which encode the video into VP9 and audio into Opus using the WebCodecs browser API.

The encoded video and audio from each worker is passed into a third worker which muxes it into WebM format.

I cannot provide you with a code sample as I have not used any of the mentioned libraries myself, but I am sure that after understanding the relation between encoders and muxers you should be able to solve the problem on your own.

EDIT: I have found another library which might help you. According to their README:

What's supported:

  • MP4 video muxing (taking already-encoded H264 frames and wrapping them in a MP4 container)
  • MP4/H264 encoding and muxing via WebCodecs

Many libraries and sources I find online seem to be WASM-based, usually implemented in C or another language that compiles to native machine code. This is probably because large libraries already exist (the first that comes to mind is FFmpeg) which deal with all sorts of media formats, and that is what they are written in. JS libraries are often written as bindings to said native code to avoid reinventing the wheel. Additionally, I would assume that performance may also be a factor.

Disclaimer: While you used video/webm as the MIME type in your code sample, you did not explicitly state what file format you want your output to be, so I allowed myself to reference some libraries which produce other formats.

EDIT 2:

David Kanal's answer below provides another example of a library which could be used for muxing WebM.

msaw328
  • I will upvote since it's correct that he needs a container format for the keyframe data. What's wrong/missing is **(1)** The belief that these WASM based codes are needed for **muxing** (can be done in pure Javascript). They are implemented in C not for speed but because they are using pre-existing C code like FFmpeg's or similar to power their abilities. WebCodecs is **exactly** meant to replace the need for these WASM workarounds when encoding. – VC.One Dec 23 '21 at 09:23
  • **(2)** Before muxing anything his raw keyframes need their format's metadata. For example: A **VP8** keyframe needs a VP8 or **webP** header before muxing into webM. To make one he needs to only create an Array of 20 values (bytes) then also copy/paste in the blob's own array values after these 20 values. _Eg:_ `52 49 46 46 AA AA AA AA 57 45 42 50 56 50 38 20 BB BB BB BB` is where you replace the four values **0xAA** with **12 + SIZE** of keyframe bytes (as 32-bit integer) and four **0xBB** is just **SIZE** of keyframe. Size means length of array. At this point data is now muxed into webP. – VC.One Dec 23 '21 at 09:43
  • **(3)** A similar setup can also be used for H.264 keyframes. For that you need around 40 bytes for the **SPS** and **PPS** etc which any MP4 muxer will expect to exist in an H264 stream. The SPS will contain numbers like frame width/height that are transferred to the MP4 header when it is created. WebCodecs does not make SPS and PPS (in JS you can write your own Array values, based on your canvas size etc)... So that is what's missing: a notice that Asker still needs to prepare raw keyframe data **also with** its expected metadata (_eg:_ a **webP header** or **H.264 header**) before containing. – VC.One Dec 23 '21 at 09:58
  • Thanks for the valuable info @VC.One. To address your points: (1) is something I forgot to mention and will add to my answer shortly. About (2) and (3), I assumed that libraries providing muxer functionality will handle metadata to be able to work with WebCodecs-produced output. Checking one of them, I have found that the output callback of the encoder [does call a function](https://github.com/mattdesl/mp4-wasm/blob/master/src/extern-post.js#L166) named `writeAVC()` which seems to write SPS and PPS metadata into a buffer. Only after that is the data sent to the actual muxer. – msaw328 Dec 23 '21 at 10:55
  • I also assume that if a muxing API becomes part of the standard, it will handle metadata as well to work seamlessly with WebCodecs. Because of this I allowed myself to mention metadata and formatting only briefly. I tried focusing more on the programming problem, while explaining the underlying concepts without much detail. Despite that, I should probably mention in the answer that there is more to the topic than just what I described, which I will do shortly as well. – msaw328 Dec 23 '21 at 11:12
  • I just recently coded https://github.com/Vanilagy/webm-muxer which is a fully featured video + audio muxer for WebM files, in pure TypeScript (no hefty wasm runtime). Would you mind adding it to your post? – DavidsKanal Nov 10 '22 at 19:13
  • Can you also add https://github.com/Vanilagy/mp4-muxer ? – DavidsKanal Apr 13 '23 at 16:32

Update (2023-04-13):

Made a muxer for MP4: https://github.com/Vanilagy/mp4-muxer

Update (2022-11-10):

As the libraries I found for this topic were insufficient for my needs, I created my own: https://github.com/Vanilagy/webm-muxer

This is a full-featured WebM muxer (video + audio) in pure TypeScript requiring no hefty wasm files. Usage is explained in great detail in the README. This library powers a video recording feature in my browser-based game.


Thought I'd drop my two cents on this topic, as I recently struggled with the exact same thing the OP mentioned.

I managed to find a solution to render and export WebM files, albeit without audio.

I found an official example from W3C here: https://w3c.github.io/webcodecs/samples/capture-to-file/capture-to-file.html. It captures your webcam's video stream and saves it as a .webm file on your disk. Diving into the code, the part responsible for taking encoded video chunks and writing (muxing) them into a playable WebM file is webm-writer2.js

With that file included in the site, all one needs to do to write a WebM file is this:

// Acquire `fileHandle` somewhere, I use
// https://developer.mozilla.org/en-US/docs/Web/API/Window/showSaveFilePicker

let fileWritableStream = await fileHandle.createWritable();

// This WebMWriter thing comes from the third-party library
let webmWriter = new WebMWriter({
    fileWriter: fileWritableStream,
    codec: 'VP9',
    width: width,
    height: height
});

let encoder = new VideoEncoder({
    output: chunk => webmWriter.addFrame(chunk),
    error: e => console.error(e)
});
// Configure to your liking
encoder.configure({
    codec: "vp09.00.10.08",
    width: width,
    height: height,
    bitrate: bitrate,
    latencyMode: 'realtime'
});

Then, simply pump frames into the encoder as usual using encoder.encode(videoFrame).
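When you're done, something along these lines should finish the file (a sketch; complete() is the finalize method of the webm-writer2.js used in the W3C sample, so verify it against the copy you include):

// Rough sketch of finishing up, assuming the webm-writer2.js from the W3C sample
await encoder.flush();            // wait until every queued frame has been encoded
await webmWriter.complete();      // let the writer finalize the WebM structure
await fileWritableStream.close(); // flush everything to the file on disk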

Hope this helps someone.

DavidsKanal

Like msaw328 says, you have to add a few format-specific bytes to your raw encoded chunk blob before getting a file. But the browser already knows how to do this! The question becomes, how can I tell the browser to do this?

Well, with captureStream you can get a stream of what's happening on a canvas and use MediaRecorder to record that stream; I explain how to do this in this answer. That's what you already did, and it has two issues:

  • if drawing stuff on the canvas takes less than 1/60s, we're making the user wait for nothing
  • if drawing stuff on the canvas takes more than 1/60s, the output video is going to be slowed down

So another setup we can have is to not use VideoEncoder directly, but rather use MediaStreamTrackGenerator to generate a stream from raw VideoFrames, and pass the stream to MediaRecorder. All in all it looks like this:

(async () => {
  // browser check
  if (typeof MediaStreamTrackGenerator === "undefined" || typeof MediaStream === "undefined" || typeof VideoFrame === "undefined") {
    console.log('Your browser does not support the web APIs used in this demo');
    return;
  }
  
  // canvas setup
  const canvas = document.createElement("canvas");
  canvas.width = 256;
  canvas.height = 256;
  const ctx = canvas.getContext("2d");

  // recording setup
  const generator = new MediaStreamTrackGenerator({ kind: "video" });
  const writer = generator.writable.getWriter();
  const stream = new MediaStream();
  stream.addTrack(generator);
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  recorder.start();

  // animate stuff
  console.log('rendering...')
  for (let i = 0; i < 246; i++) {
    ctx.fillStyle = "grey";
    ctx.fillRect(0, 0, canvas.width, canvas.height);
    ctx.fillStyle = "red";
    ctx.fillRect(i, i, 10, 10);

    const frame = new VideoFrame(canvas, { timestamp: i / 29.97 });
    await writer.write(frame);
    await new Promise(requestAnimationFrame);
  }
  console.log('rendering done');

  // stop recording and display the result
  recorder.addEventListener("dataavailable", (evt) => {
    const video = document.createElement('video');
    video.src = URL.createObjectURL(evt.data);
    video.muted = true;
    video.autoplay = true;
    document.body.append(video);
  });
  recorder.stop();
})();

One thing I still fail to understand is why we need to wait for the next frame: if we don't, the generated blob is empty, and if we wait twice as long, the generated video is twice as slow. Maybe MediaRecorder is only supposed to work in real time, or maybe it's a Chromium bug.

Nino Filiu
  • Thanks Nino, this is actually exactly what I am currently doing. I thought, however, that using a video encoder might be faster than this approach, because for some reason I remember this didn't work inside a web worker. – Amit Apr 18 '22 at 16:55
  • Canvases don't fully work in web workers, but [offscreen canvas](https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas) does – Nino Filiu Apr 19 '22 at 08:53
  • Hey! The example doesn't really work, at least not in the SO embed nor in my personal project. One thing that stood out was that you're passing seconds to `timestamp`, but timestamp actually wants microseconds (according to MDN). – DavidsKanal Oct 14 '22 at 15:18