
I have a server application which renders a 30 FPS video stream, then encodes and muxes it in real time into a WebM byte stream.

On the client side, an HTML5 page opens a WebSocket to the server, which starts generating the stream once the connection is accepted. After the header is delivered, each subsequent WebSocket frame contains a single WebM SimpleBlock. A keyframe occurs every 15 frames, and when this happens a new Cluster is started.

The client also creates a MediaSource and, on receiving a frame from the WebSocket, appends the content to its active buffer. The `<video>` starts playback immediately after the first frame is appended.
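Roughly, the client side looks like this (the codec string, URL, and the append queue are simplified placeholders, not my exact code):

```js
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp8"');
  const queue = [];

  // appendBuffer() may not be called while the SourceBuffer is updating,
  // so frames that arrive in the meantime are queued.
  sourceBuffer.addEventListener('updateend', () => {
    if (queue.length > 0 && !sourceBuffer.updating) {
      sourceBuffer.appendBuffer(queue.shift());
    }
  });

  const ws = new WebSocket('wss://example.com/stream');
  ws.binaryType = 'arraybuffer';
  ws.onmessage = (event) => {
    // The first frame carries the WebM header; each later one is a SimpleBlock.
    if (sourceBuffer.updating || queue.length > 0) {
      queue.push(event.data);
    } else {
      sourceBuffer.appendBuffer(event.data);
      video.play(); // playback starts as soon as data lands in the buffer
    }
  };
});
```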

Everything works reasonably well. My only issue is that network jitter causes the playback position to drift away from real time after a while. My current solution is to hook into the `updateend` event, check the difference between `video.currentTime` and the timecode on the incoming Cluster, and manually update `currentTime` if it falls outside an acceptable range. Unfortunately, this causes a noticeable pause and jump in the playback, which is rather unpleasant.
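The correction looks roughly like this (`clusterTimecodeSeconds` is a stand-in for the bookkeeping my demuxing code does, and the threshold is arbitrary):

```js
const MAX_DRIFT = 1.0; // seconds; acceptable drift before correcting

sourceBuffer.addEventListener('updateend', () => {
  // clusterTimecodeSeconds: timecode of the latest Cluster, tracked elsewhere
  const drift = clusterTimecodeSeconds - video.currentTime;
  if (Math.abs(drift) > MAX_DRIFT) {
    video.currentTime = clusterTimecodeSeconds; // this is the visible pause/jump
  }
});
```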

The solution also feels a bit odd: I know exactly where the latest keyframe is, yet I have to convert it into a whole second (as per the W3C spec) before I can pass it into `currentTime`, where the browser presumably then has to go and find the nearest keyframe.

My question is this: is there a way to tell the Media Element to always seek to the latest keyframe available, or keep the playback time synchronised with the system clock time?

Saran Tunyasuvunakool
  • Have you found a solution? – Keyne Viana Dec 15 '17 at 16:25
  • I've realized that your solution of changing currentTime is the best one; the glitches it causes are acceptable, because before that a network glitch has already caused the buffer to grow, so another glitch to fix it is normal... it will only keep glitching if there are consecutive connection problems during the stream. – Keyne Viana Dec 21 '17 at 18:36
  • Is there any way to convert a WebM SimpleBlock into a standalone WebM file with a header? – Suman Bogati Jun 05 '20 at 14:36

1 Answer


> network jitter causes the playback position to drift

That's not your problem. If you are experiencing drop-outs in the stream, you aren't buffering enough before starting playback in the first place. With an appropriately sized buffer, playback stays smooth, even if it runs a few seconds behind realtime (which is normal).

> My current solution is to hook into the updateend event, check the difference between the video.currentTime and the timecode on the incoming Cluster

That's close to the correct method. I suggest you ignore the timecode of the incoming cluster and instead inspect your buffered time ranges. What you've received in the WebM cluster and what has actually been decoded into the buffer are two different things.
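Something along these lines (a sketch; the helper name is mine):

```js
// End of the most recently buffered range, as reported by the media element
// itself rather than by the incoming WebM data.
function bufferedEnd(video) {
  const ranges = video.buffered;
  return ranges.length > 0 ? ranges.end(ranges.length - 1) : 0;
}

const lag = bufferedEnd(video) - video.currentTime; // seconds behind the buffered edge
```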

> Unfortunately, this causes a noticeable pause and jump in the playback which is rather unpleasant.

How else would you do it? You can either jump to realtime, or you can increase the playback speed to catch up to realtime. Either way, if you want to catch up to realtime, you have to skip ahead in time to do it.
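The playback-speed approach is less jarring. A sketch, using the `bufferedEnd()` helper above (the thresholds and the 1.1 rate are arbitrary illustration values):

```js
setInterval(() => {
  const lag = bufferedEnd(video) - video.currentTime;
  if (lag > 3) {
    video.currentTime = bufferedEnd(video) - 0.5; // far behind: hard jump
  } else if (lag > 0.5) {
    video.playbackRate = 1.1; // slightly behind: quietly speed up
  } else {
    video.playbackRate = 1.0; // caught up: normal speed
  }
}, 1000);
```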

> The solution also feels a bit odd: I know exactly where the latest keyframe is

You may, but the player doesn't until that media is decoded. In any case, the keyframe is irrelevant... you can seek to non-keyframe locations. The browser will decode from the preceding keyframe through the P/B-frames as required.

> I have to convert it into a whole second (as per the W3C spec) before I can pass it into currentTime

That's totally false. `currentTime` is specified as a double: https://www.w3.org/TR/2011/WD-html5-20110113/video.html#dom-media-currenttime

> My question is this: is there a way to tell the Media Element to always seek to the latest keyframe available, or keep the playback time synchronised with the system clock time?

It's going to play the latest buffered data automatically. You don't need to do anything. You do your job by ensuring media data lands in the buffer and by setting the playback position as close to the buffered edge as is reasonable. You can always nudge it forward if network conditions change in a way that allows it, but frankly it sounds as if you just have broken code and a broken buffering strategy. Otherwise, playback would simply be smooth.

Catching up after falling behind is not going to happen automatically, nor should it. If the player pauses because the buffer has drained, the buffer needs to be built back up before playback can resume. That's the whole point of the buffer.

Furthermore, your expectation of keeping anything in time with the system clock is neither a good idea nor reasonable. Different devices have different refresh rates and will handle video at different rates. Just hit play and let it play. If you end up several seconds off, go ahead and set currentTime, but be very confident of what you've buffered before doing so.
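One way to make that safer (a sketch; the function and the backoff value are my own, not a standard API):

```js
// Seek only if the target actually lies inside a buffered range, so the
// jump won't immediately stall the player.
function seekToLiveEdge(video, backoff = 0.25) {
  const ranges = video.buffered;
  for (let i = ranges.length - 1; i >= 0; i--) {
    const target = ranges.end(i) - backoff;
    if (target > video.currentTime && target >= ranges.start(i)) {
      video.currentTime = target;
      return true;
    }
  }
  return false; // nothing safely buffered ahead; leave playback alone
}
```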

Brad
  • Not sure if you got the problem, but if you deliver an mp3 file, for example, from Node with sockets in packets of 2.7 seconds, the player starts like it should, but if you pause it manually the buffer starts to grow, and when you play again it resumes from where you stopped. That causes the playback to fall out of sync with the Node.js stream. That's what happens if you leave the playback running for several hours: it pauses automatically due to glitches and the buffer starts to grow. – Keyne Viana Dec 16 '17 at 00:58
  • The Node.js server sends 2.7 seconds of audio, waits for it to play, and then sends the next packet. But with the buffer growing you can no longer control whether the streamed data has been played or not. I've tried sending an "acknowledgement" that the packet was played so that the next one could be delivered, but ran into other problems, like the playback stopping. So the question is how to stream data and have it played immediately and continuously, for hours. – Keyne Viana Dec 16 '17 at 00:59
  • @KeyneViana Why would you wait to send the next chunk of data? Just keep sending it. Furthermore, you shouldn't be waiting 2.7 seconds... MP3 frames can be as small as 576 samples. – Brad Dec 16 '17 at 03:05
  • Because it's live, the transmission should stay in sync across different browsers. If I send only the frames that are about to be played, anyone who starts listening to the transmission joins it at the current point, for example 00:56. If you deliver all frames without waiting, you can't control this aspect. – Keyne Viana Dec 16 '17 at 22:41
  • @KeyneViana No. Precisely because it's live, you should continue streaming data. You can't send a chunk, wait for it to play, get acknowledgement of this, and then send the next chunk. Data can't be sent, parsed, and decoded instantaneously. The internet is not a circuit-switched network. You *must* continually send data if you expect it to be continuously played. For any folks that connect mid-stream, you simply send data from that point forward. – Brad Dec 21 '17 at 17:14
  • I see your point, but in my case I don't have a unique stream for each client; my Node.js server has one stream which is transmitted to all connected sockets. This stream is also piped to Liquidsoap; the socket is the audio return of what is being streamed, and the Liquidsoap pipe controls the back-pressure of the stream, which sends data continuously but with a delay of about 2.7 seconds between frames, and this works perfectly. The problem described by the OP can actually be solved just by adjusting currentTime like he said (I've done it now after a few tests). – Keyne Viana Dec 21 '17 at 18:43
  • The problem the OP pointed out occurs only if the internet connection stays slow or unreachable during the entire playback. – Keyne Viana Dec 21 '17 at 18:44
  • @KeyneViana If you're piping the stream, then you *are* streaming, and there's no issue other than that your source is sending 2.7-second chunks. In these cases I would strongly recommend keeping the last chunk buffered server-side, and then flushing the entire buffer when a new client connects (see the sketch after these comments). This enables a fast start for the client while ensuring they have a full 2.7-second buffer to play back just in time for the next chunk to come in. – Brad Dec 21 '17 at 18:54
  • You seem to be far ahead of me on audio concepts, so sometimes it's hard to understand your point; I'm still a layman when it comes to audio :). This specific 2.7-second case is handled by a library called nodeshout (a wrapper for libshout). So far, I would say that I'm not facing any issues with this architecture as described. I simply pipe the readable stream, and the back-pressure controls both the socket (handled with `on('data', ...)`) and the streaming to Liquidsoap (with `pipe()`). – Keyne Viana Dec 21 '17 at 19:02
  • I am trying to implement "seek" using `MediaSource` ([see](https://stackoverflow.com/questions/64087720/currenttime-set-to-different-value-after-loadmetadata-event-during-seek)) but `currentTime` gets messed up in doing so. You say "_go ahead and set currentTime, but be very confident of what you've buffered before doing so_" - but how can we start playing an audio file from the middle and at the same time get the correct `currentTime`? – Stefan Falk Sep 27 '20 at 16:03
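For reference, the server-side buffering Brad suggests in the comments might look like this in Node.js (a sketch using the `ws` package; `wss` and `sourceStream` are hypothetical names for an existing WebSocket server and live audio source):

```js
const WebSocket = require('ws');

// wss: an existing WebSocket.Server; sourceStream: the live audio source.
const recentChunks = [];
const MAX_CHUNKS = 1; // keep roughly one 2.7-second chunk around

sourceStream.on('data', (chunk) => {
  recentChunks.push(chunk);
  if (recentChunks.length > MAX_CHUNKS) recentChunks.shift();
  // Live clients get every chunk as it arrives.
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) client.send(chunk);
  }
});

wss.on('connection', (client) => {
  // Fast start: flush the buffered chunk(s) so a new listener has a full
  // buffer just in time for the next live chunk.
  for (const chunk of recentChunks) client.send(chunk);
});
```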