72

I'm working on a client-side project which lets a user supply a video file and apply basic manipulations to it. I'm trying to extract the frames from the video reliably. At the moment I have a <video> element which I load the selected video into, and then I pull out each frame as follows:

  1. Seek to the beginning
  2. Pause the video
  3. Draw <video> to a <canvas>
  4. Capture the frame from the canvas with .toDataURL()
  5. Seek forward by 1 / 30 seconds (1 frame).
  6. Rinse and repeat

This is a rather inefficient process and, more importantly, it is proving unreliable: I often get stuck (repeated) frames. This seems to happen because the <video> element has not yet updated to the new frame when it is drawn to the canvas.

I'd rather not have to upload the original video to the server just to split the frames, and then download them back to the client.

Any suggestions for a better way to do this are greatly appreciated. The only caveat is that I need it to work with any format the browser supports (decoding in JS isn't a great option).

The Busy Wizard
  • Instead of seeking 1/30s forward, you should attach a function to the `timeupdate` event of the video. But clearly, if you don't need to do it on the client side, don't use a browser for that; ffmpeg or any video tool will be more powerful for this. – Kaiido Sep 22 '15 at 01:17
  • @Kaiido I would much rather handle it server side, but the video comes from the client, and I need the frames on the client, so if I can avoid an upload/download cycle, I'd be much better off. Also, as far as seeking goes, I've had issues with slower devices dropping frames because they can't grab the frame data quickly enough. I'm not familiar with the `timeupdate` event, though, so I will look into it. Also, by specifying the seek time for each frame, I can control the framerate if need be. – The Busy Wizard Sep 22 '15 at 03:58
  • Ok, so please could you clarify this point (that it's for a website, client side) in an edit to your question, that wasn't clear at all. Also, the framerate in browser is absolutely not constant, so you should not rely on it. But at each new frame painted, the timeupdate event should trigger, so I believe it will be more reliable. – Kaiido Sep 22 '15 at 04:03

3 Answers

80

[2021 update]: Since this question (and answer) was first posted, things have evolved in this area, and it is finally time for an update. The method originally presented here is now out of date, but luckily a few new or upcoming APIs make extracting video frames much easier:

The most promising and powerful one, but still under development, with a lot of restrictions: WebCodecs

This new API gives access to the browser's media decoders and encoders, enabling us to read raw data from video frames (YUV planes), which may be a lot more useful for many applications than rendered frames. For those who need rendered frames, the VideoFrame interface that this API exposes can be drawn directly to a <canvas> element or converted to an ImageBitmap, avoiding the slow route through the MediaElement.
However, there is a catch: apart from its currently low browser support, this API requires that the input has already been demuxed.
There are some demuxers available online; for instance, for MP4 videos GPAC's mp4box.js will help a lot.
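
To make "demuxed input" a bit more concrete: the decoder consumes EncodedVideoChunk objects, so every sample coming out of the demuxer has to be mapped to one. Below is a minimal sketch of that mapping. The helper name `sampleToChunkInit` is made up for illustration, and the sample field names (`is_sync`, `cts`, `duration`, `timescale`, `data`) are what I understand mp4box.js samples to expose; treat them as assumptions and check them against the demuxer you actually use.

```javascript
// Hypothetical helper (not part of WebCodecs or mp4box.js): maps one demuxed
// sample to the init dictionary expected by `new EncodedVideoChunk(init)`.
// Assumed sample shape: { is_sync, cts, duration, timescale, data }.
function sampleToChunkInit(sample) {
  // Convert from the track's own time units to microseconds.
  const toMicroseconds = (ticks) =>
    Math.round((ticks * 1e6) / sample.timescale);
  return {
    type: sample.is_sync ? "key" : "delta", // key frames decode on their own
    timestamp: toMicroseconds(sample.cts), // WebCodecs timestamps are in microseconds
    duration: toMicroseconds(sample.duration),
    data: sample.data, // the encoded bytes
  };
}

// In a browser with WebCodecs, the result would be used like this:
//   decoder.decode(new EncodedVideoChunk(sampleToChunkInit(sample)));
```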

A full example can be found on the proposal's repo.

The key part consists of

const decoder = new VideoDecoder({
  output: onFrame, // the callback to handle all the VideoFrame objects
  error: e => console.error(e),
});
decoder.configure(config); // depends on the input file, your demuxer should provide it
demuxer.start((chunk) => { // depends on the demuxer, but you need it to return chunks of video data
  decoder.decode(chunk); // will trigger our onFrame callback  
})
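
For completeness, here is a rough sketch of what the `onFrame` callback passed as `output` above could look like when rendered frames are wanted. `makeOnFrame` and its `onDrawn` hook are names invented for this example, and `ctx` is assumed to be a 2D canvas context. The part that matters is closing each VideoFrame once you are done with it; otherwise the decoder may stall or memory may pile up.

```javascript
// Hypothetical factory: builds an `output` callback for a VideoDecoder.
// `ctx` is assumed to be a CanvasRenderingContext2D;
// `onDrawn(index, timestamp)` is an optional hook invented for this sketch.
function makeOnFrame(ctx, onDrawn = () => {}) {
  let index = 0;
  return function onFrame(frame) {
    try {
      // A VideoFrame can be passed to drawImage() like an image or a <video>.
      ctx.drawImage(frame, 0, 0);
      onDrawn(index++, frame.timestamp); // timestamp is in microseconds
    } finally {
      // Always release the frame, even if drawing threw.
      frame.close();
    }
  };
}
```

It would be wired up as `new VideoDecoder({ output: makeOnFrame(ctx), error: console.error })`. To find out when decoding has finished, `await decoder.flush()` after the last `decode()` call; as far as I can tell, the returned promise resolves once all pending frames have been handed to the output callback.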

Note that we can even grab the frames of a MediaStream, thanks to MediaCapture Transform's MediaStreamTrackProcessor. This means that we should be able to combine HTMLMediaElement.captureStream() with this API to get our VideoFrames without needing a demuxer. However, this only works for a few codecs, and it means that we can only extract frames at reading speed (that is, in real time while the media plays)...
Anyway, here is an example that works in the latest Chromium-based browsers, with chrome://flags/#enable-experimental-web-platform-features switched on:

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (window.MediaStreamTrackProcessor) {
    let stopped = false;
    const track = await getVideoTrack();
    const processor = new MediaStreamTrackProcessor(track);
    const reader = processor.readable.getReader();
    readChunk();

    function readChunk() {
      reader.read().then(async({ done, value }) => {
        if (value) {
          const bitmap = await createImageBitmap(value);
          const index = frames.length;
          frames.push(bitmap);
          select.append(new Option("Frame #" + (index + 1), index));
          value.close();
        }
        if (!done && !stopped) {
          readChunk();
        } else {
          select.disabled = false;
        }
      });
    }
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoTrack() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  const [track] = video.captureStream().getVideoTracks();
  video.onended = (evt) => track.stop();
  return track;
}
video,canvas {
  max-width: 100%
}
<button>start</button>
<select disabled>
</select>
<canvas></canvas>

The easiest to use, but still with relatively poor browser support, and subject to the browser dropping frames: HTMLVideoElement.requestVideoFrameCallback

This method lets us schedule a callback that runs whenever a new frame is painted on the HTMLVideoElement.
It is higher level than WebCodecs and thus may have more latency; moreover, with it we can only extract frames at reading speed.

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (HTMLVideoElement.prototype.requestVideoFrameCallback) {
    let stopped = false;
    const video = await getVideoElement();
    const drawingLoop = async(timestamp, frame) => {
      const bitmap = await createImageBitmap(video);
      const index = frames.length;
      frames.push(bitmap);
      select.append(new Option("Frame #" + (index + 1), index));

      if (!video.ended && !stopped) {
        video.requestVideoFrameCallback(drawingLoop);
      } else {
        select.disabled = false;
      }
    };
    // the last call to rVFC may happen before .ended is set but never resolve
    video.onended = (evt) => select.disabled = false;
    video.requestVideoFrameCallback(drawingLoop);
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoElement() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  return video;
}
video,canvas {
  max-width: 100%
}
<button>start</button>
<select disabled>
</select>
<canvas></canvas>
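
The callback's second argument is a metadata object, and its `presentedFrames` counter can be used to notice when the browser presented frames that our callback never saw. A small sketch (the `createDropCounter` helper is hypothetical, written only for this illustration):

```javascript
// Hypothetical helper: counts frames that were presented by the browser
// between two consecutive requestVideoFrameCallback invocations.
function createDropCounter() {
  let lastPresented = null;
  let missed = 0;
  return {
    // Call with the `metadata` argument of every callback invocation.
    update(metadata) {
      if (lastPresented !== null) {
        // A jump of more than 1 means frames went by without a callback.
        missed += Math.max(0, metadata.presentedFrames - lastPresented - 1);
      }
      lastPresented = metadata.presentedFrames;
      return missed;
    },
  };
}

// Possible use inside the loop from the example above:
//   const counter = createDropCounter();
//   const drawingLoop = async (now, metadata) => {
//     const missedSoFar = counter.update(metadata);
//     ...
//   };
```

This only reports the problem; it cannot recover the missed frames, which is why the WebCodecs route remains preferable when every frame is needed.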

For your Firefox users, Mozilla's non-standard HTMLMediaElement.seekToNextFrame()

As its name implies, this makes your <video> element seek to the next frame.
Combining this with the seeked event, we can build a loop that grabs every frame of our source, faster than reading speed (yeah!).
But this method is proprietary, available only in Gecko-based browsers, not on any standards track, and will probably be removed in the future once they implement the methods presented above.
But for the time being, it is the best option for Firefox users:

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (HTMLMediaElement.prototype.seekToNextFrame) {
    let stopped = false;
    const video = await getVideoElement();
    const requestNextFrame = (callback) => {
      video.addEventListener("seeked", () => callback(video.currentTime), {
        once: true
      });
      video.seekToNextFrame();
    };
    const drawingLoop = async(timestamp, frame) => {
      if(video.ended) {
        select.disabled = false;
        return; // FF apparently doesn't like to create ImageBitmaps
                // from ended videos...
      }
      const bitmap = await createImageBitmap(video);
      const index = frames.length;
      frames.push(bitmap);
      select.append(new Option("Frame #" + (index + 1), index));

      if (!video.ended && !stopped) {
        requestNextFrame(drawingLoop);
      } else {
        select.disabled = false;
      }
    };
    requestNextFrame(drawingLoop);
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoElement() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  return video;
}
video,canvas {
  max-width: 100%
}
<button>start</button>
<select disabled>
</select>
<canvas></canvas>

The least reliable, which stopped working over time: HTMLVideoElement.ontimeupdate

The pause - draw - play - wait for timeupdate strategy used to be (in 2015) quite a reliable way to know when a new frame got painted to the element. Since then, however, browsers have put serious limitations on this event, which used to fire at a high rate, and now there isn't much information we can grab from it...

I am not sure I can still advocate its use. I didn't check how Safari (currently the only browser without a solution) handles this event (its handling of media seems very odd to me), and there is a good chance that a simple setTimeout(fn, 1000 / 30) loop is actually more reliable in most cases.
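
For reference, such a setTimeout loop could look roughly like the sketch below. `pollFrames` and its parameters are names I made up, and the `schedule` parameter only exists so the timer can be swapped out. It skips ticks where playback did not advance (for example while buffering), but it is still only an approximation: it can miss or duplicate frames.

```javascript
// Hypothetical helper: polls a playing <video> at roughly `fps` ticks per
// second and calls `onFrame(video, time)` whenever playback has advanced.
// Returns a function that stops the polling.
function pollFrames(video, onFrame, fps = 30, schedule = setTimeout) {
  let lastTime = -1;
  let stopped = false;

  function tick() {
    if (stopped) return;
    if (video.currentTime !== lastTime) {
      lastTime = video.currentTime;
      onFrame(video, lastTime); // e.g. draw `video` to a canvas here
    }
    if (!video.ended) schedule(tick, 1000 / fps);
  }

  tick();
  return () => {
    stopped = true;
  };
}
```

Usage would be along the lines of `await video.play(); const stop = pollFrames(video, (v) => ctx.drawImage(v, 0, 0));`.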

Kaiido
  • 2
    My assumption is not that there will be a new frame every 30th of a second, but if I get 1 frame every 1/30 seconds, I will end up with 30 FPS. The mistake I made is assuming that the frames I get will actually be the correct and current frame. I appreciate the help. The timeupdate event is very helpful. TY :) – The Busy Wizard Sep 22 '15 at 05:31
  • 2
    This method isn't exactly reliable: It once returns 4172 frames, and then 4573 on a second run, on a video which really has just 250 frames according to ffmpeg. 10 seconds @ 25 fps: http://www.w3schools.com/html/mov_bbb.mp4 – Dinesh Bolkensteyn Nov 23 '16 at 14:50
  • 1
    @DineshBolkensteyn, did you read the header of this answer and the linked answer? No, this method doesn't reliably extract all the video frames that are in the file, just the ones that have been painted by the browser, which doesn't respect the video's framerate. – Kaiido Nov 23 '16 at 23:06
  • `timeupdate` fires about 5 times per second, definitely not a good way to get all the frames. – fregante Mar 22 '18 at 02:52
  • @bfred.it seems chrome has changed their code... The throttling on timeupdate should not occur on the first event, and thus we should not be concerned here since we pause/resume at every frame. (I've got about 60 frames per video's second on my FF) – Kaiido Mar 22 '18 at 05:10
  • Please note, that most browsers disabled autoplay functionality, add `video.play();` at the end for this to work. – Ben Winding Aug 22 '18 at 02:02
  • @TylerDurden This limitation should be only for videos with sound channel. Isn't the muted property enough for you? – Kaiido Aug 22 '18 at 02:04
  • Hmm right, even muted videos don't autoplay in current Chrome... weird. – Kaiido Aug 22 '18 at 02:10
  • @Kaiido this doesn't seem to output the correct frames in chrome, and the `onend` event doesn't fire. Firefox works nicely though, can you confirm? – Ben Winding Aug 22 '18 at 02:45
  • @TylerDurden indeed ended fails to fire on mine too... Sounds like bugs from their part. Easy workaround is to call onend in the drawFrame if currentTime >= duration. Though as you can see in the previous comments a lot have changed since I first wrote this answer, and it definitely requires a complete update, for which I unfortunately don't have time... – Kaiido Aug 22 '18 at 02:57
  • This doesn't work, the event timeupdate only fires every 0.25 seconds – Bill Yan Jun 26 '19 at 17:03
  • @BillYan yes, this answer says exactly that there is no reliable way to do it and that the technique it uses is also broken. However, a 0.25s rate is **really slow**. I haven't checked recently, but you should know that *at the time of writing* it was the most reliable way in most mainstream browsers. But I get that it has changed since then. I'll have to update this answer some day – Kaiido Jun 27 '19 at 01:00
  • What can cause this answer to break with some video files? the first few frames are good, and it repeats the last frames many times – Ferrybig Jan 30 '21 at 20:11
  • What about the new function requestVideoFrameCallback? It is quite recent. Works on Chrome 83+ https://caniuse.com/?search=requestVideoFrameCallback – Bruno Marotta Feb 09 '21 at 20:11
  • Great update @Kaiido! With "reading speed" you mean "play through speed" or just "slow"? – HenrikSN Sep 09 '21 at 15:46
  • @HenrikSN yes I meant like real-time encoding of the media read at its default speed. Is "play through speed" the correct term for this? I didn't get much hits on that one. – Kaiido Sep 10 '21 at 00:41
  • 1
    @Kaiido OK, then I understand. No, I couldn't find a good term for it, so "play through speed" was just my re-wording. – HenrikSN Sep 10 '21 at 08:19
  • how many frames can we extract with the first solution? – naoval luthfi Nov 30 '21 at 14:58
  • @naovalluthfi that will depend where and how you store them. If you send them to a server or save to disk directly then the limit is the hard drive space. If you keep them as bitmap in memory then the limit is the device's RAM (I guess). If you don't save the frames, infinitely. – Kaiido Nov 30 '21 at 22:52
  • @kaiido i was trying using 5 years old code from a friend, and i ran to this problem https://stackoverflow.com/questions/70081452/out-of-memory-at-imagedata-creation-angular-when-extracting-frames-from-video?noredirect=1#comment123894889_70081452. Could u help me? – naoval luthfi Dec 01 '21 at 00:52
  • I didn't want to edit your post, but I recommend looking into this: About `HTMLMediaElement.seekToNextFrame()` and "subject to the browser dropping frames". I had a lot of dropped frames, but if you `pause()` the video after the seek and then keep seeking for new frames, then no frames will be dropped anymore. – clankill3r Dec 02 '21 at 10:55
  • @Kaiido What's there for iOS Safari, seems like it's still hard to do that. Even if with canvas method, the larger videos are still not fine. Any suggestions ? – shivamragnar Feb 18 '22 at 16:17
  • @shivamragnar unfortunately no, it doesn't help that I don't have an iOS device to test, but since the aforementioned solutions aren't available there the only one would be a setTimeout solution, which is not reliable at all... And to add to that IIRC iOS Safari doesn't let us use a lot of memory, which may be yet another limitation there... – Kaiido Feb 19 '22 at 00:59
  • @Kaiido Seems like we are helpless when it comes to iOS Safari. I will also post a solution for Safari once i come up with a good one. The good old canvas method fails when we use larger videos say more then 200 MBs or with higher quality. – shivamragnar Feb 20 '22 at 15:37
  • @SanthoshDhaipuleChandrakanth no, all these solutions require the browser to be able to decode the video input. – Kaiido Mar 23 '22 at 01:08
  • The first two examples seem to run infinitely for me; I counted 8000 frames on a 47 second file before my computer ran out of memory. This video shouldn't have more than 47*30 = 1410 frames. And the final example is Firefox-only, what a shame. The only thing that worked for me is the example below that seeks the video and writes it to a canvas, but it is excruciatingly slow, much slower than playing back the video in real time. – Tenpi Mar 25 '22 at 19:05
  • @Tenpi note that the first and recommended way is https://github.com/w3c/webcodecs/tree/main/samples/mp4-decode Then, the one using a MediaStream was supposed to stop itself, I edited it to use an explicit stop but I'd be interested in seeing your code with your media where this didn't stop itself. For the requestVideoFrameCallback I guess Chrome introduced a new bug here, but I also edited this example to avoid it. – Kaiido Mar 26 '22 at 02:31
  • I tried the first example by copying their MP4Demuxer code, and it seems to be the best by far, extracted all the frames in only a few seconds. The only problem is, how do I check when it's finished? Right now I have a setTimeout that resolves the function after 50ms, and resets whenever a new frame is processed, and it seems to work ok but I wonder if there's a better method. Also, the infinite frames problem might have something to do with React because I tried the second two examples in vanilla HTML and they worked. – Tenpi Mar 26 '22 at 22:51
  • kinda similar question here, possible help with this one? https://stackoverflow.com/questions/71781255/react-canvas-returns-blank-image – walee Apr 07 '22 at 13:15
22

Here's a working function that was tweaked from this question:

async function extractFramesFromVideo(videoUrl, fps = 25) {
  // fully download it first (no buffering):
  let videoBlob = await fetch(videoUrl).then((r) => r.blob());
  let videoObjectUrl = URL.createObjectURL(videoBlob);
  let video = document.createElement("video");

  // resolves the promise awaited in the seek loop below
  let seekResolve;
  video.addEventListener("seeked", function () {
    if (seekResolve) seekResolve();
  });

  video.src = videoObjectUrl;

  // workaround chromium metadata bug (https://stackoverflow.com/q/38062864/993683)
  while (
    (video.duration === Infinity || isNaN(video.duration)) &&
    video.readyState < 2
  ) {
    await new Promise((r) => setTimeout(r, 1000));
    video.currentTime = 10000000 * Math.random();
  }
  let duration = video.duration;

  let canvas = document.createElement("canvas");
  let context = canvas.getContext("2d");
  let [w, h] = [video.videoWidth, video.videoHeight];
  canvas.width = w;
  canvas.height = h;

  let frames = [];
  let interval = 1 / fps;
  let currentTime = 0;

  while (currentTime < duration) {
    // seek, then wait for the "seeked" event before drawing
    video.currentTime = currentTime;
    await new Promise((r) => (seekResolve = r));

    context.drawImage(video, 0, 0, w, h);
    let base64ImageData = canvas.toDataURL();
    frames.push(base64ImageData);

    currentTime += interval;
  }

  // release the downloaded blob; errors above now reject the returned promise
  // instead of leaving it pending forever
  URL.revokeObjectURL(videoObjectUrl);
  return frames;
}

Usage:

let frames = await extractFramesFromVideo("https://example.com/video.webm");
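
One small refinement to consider for the seek loop above: `currentTime += interval` accumulates floating-point error over long videos. A sketch that derives each seek target from the frame index instead (the `frameTimes` helper is hypothetical, and like the function above it assumes a constant frame rate that you supply yourself):

```javascript
// Hypothetical helper: returns the seek targets (in seconds) for a video of
// the given duration, assuming a constant frame rate of `fps`.
function frameTimes(duration, fps = 25) {
  const times = [];
  // Compute each time from the index rather than by repeated addition,
  // so rounding errors do not build up from one frame to the next.
  for (let i = 0; i / fps < duration; i++) {
    times.push(i / fps);
  }
  return times;
}
```

Inside the function above, the `while` loop could then become `for (const t of frameTimes(duration, fps)) { video.currentTime = t; ... }`.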

Note that there's currently no easy way to determine the actual/natural frame rate of a video unless perhaps you use ffmpeg.js, but that's a 10+ megabyte javascript file (since it's an emscripten port of the actual ffmpeg library, which is obviously huge).

holem
  • 2
    Interesting that I happened upon this just after your answer. I was going to give a similar (but far less detailed) answer pointing on the problems with using `ontimeupdate` and suggesting using a solution that sets `video.currentTime` instead. Excellent work! – undefined Sep 19 '18 at 21:43
  • 1
    Do we have to create new canvas every time? Since its like a temp variable, can we use just one? which is better in terms of performance? – Parthiban Rajendran Mar 16 '19 at 12:43
  • 2
    @ParthibanRajendran You can use an existing canvas and clear it before starting to draw on it. This is actually probably better for performance if it suits your needs! – Lynn Mar 18 '19 at 19:44
  • `Uncaught SyntaxError: await is only valid in async function`? referring to `let frames = await...` – conner.xyz Nov 06 '19 at 17:09
  • how many frames can we extract using this? – naoval luthfi Nov 30 '21 at 14:59
  • i was using this way only, but seems like it does not work on Safari sometimes. In my case, sometimes the `seeked` event does not fire and this is weird when it happens for a frame which is in between, but before and after frames are captured perfectly. lol – shivamragnar Dec 07 '21 at 13:35
  • @shivamragnar - did you find a solution for safari? i'm seeing a similar issue on saf ios. thanks – Cam Jun 21 '22 at 21:47
  • @Cam We just ended up using FFMPEG for more reliable results. – shivamragnar Jun 23 '22 at 16:24
  • Cool! thanks for the reply! yeah, sounds like a better solution. i'm using FFMPEG in other parts of the app - reading / drawing from the video element based on the seeked event seems too unreliable. – Cam Jun 25 '22 at 22:21
1

2023 answer:

If you want to extract all frames reliably (i.e. no "seeking" and missing frames), and do so as fast as possible (i.e. not limited by playback speed or other factors) then you probably want to use the WebCodecs API. As of writing it's supported in Chrome and Edge. Other browsers will soon follow - hopefully by the end of 2023 there will be wide support.
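
Since support still varies between browsers, it can make sense to feature-detect before choosing an approach. A minimal sketch, where the function name and the returned labels are my own invention and the fallback order is just one reasonable choice:

```javascript
// Hypothetical helper: picks a frame-extraction approach based on what the
// environment exposes. `env` defaults to the global object (window in a page).
function pickFrameExtractionStrategy(env = globalThis) {
  // WebCodecs: fastest and most reliable, but needs a demuxer.
  if (typeof env.VideoDecoder === "function") return "webcodecs";

  // requestVideoFrameCallback: simple, but limited to playback speed.
  const videoProto = env.HTMLVideoElement && env.HTMLVideoElement.prototype;
  if (videoProto && "requestVideoFrameCallback" in videoProto) {
    return "requestVideoFrameCallback";
  }

  // Last resort: seek the <video> and draw it to a canvas.
  return "seek-and-draw";
}
```

Even when `VideoDecoder` exists, a given codec may not be decodable; `VideoDecoder.isConfigSupported(config)` returns a promise that can be checked for that before committing to the WebCodecs path.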

I put together a simple library for this, but it currently only supports mp4 files. Here's an example:

<canvas id="canvasEl"></canvas>
<script type="module">
  import getVideoFrames from "https://deno.land/x/get_video_frames@v0.0.9/mod.js"

  let ctx = canvasEl.getContext("2d");

  // `getVideoFrames` requires a video URL as input.
  // If you have a file/blob instead of a videoUrl, turn it into a URL like this:
  let videoUrl = URL.createObjectURL(fileOrBlob);

  await getVideoFrames({
    videoUrl,
    onFrame(frame) {  // `frame` is a VideoFrame object: https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame
      ctx.drawImage(frame, 0, 0, canvasEl.width, canvasEl.height);
      frame.close();
    },
    onConfig(config) {
      canvasEl.width = config.codedWidth;
      canvasEl.height = config.codedHeight;
    },
  });
  
  URL.revokeObjectURL(videoUrl); // revoke the object URL to prevent a memory leak
</script>

(Note that the WebCodecs API is mentioned in @Kaiido's excellent answer, but this API alone unfortunately doesn't solve the issue: the example above uses mp4box.js to handle the parts that WebCodecs doesn't, namely demuxing the container. Perhaps WebCodecs will eventually support the container side of things and this answer will become mostly irrelevant, but until then I hope that this is useful.)

joe