I have a server application which renders a 30 FPS video stream then encodes and muxes it in real-time into a WebM Byte Stream.
On the client side, an HTML5 page opens a WebSocket to the server, which starts generating the stream when connection is accepted. After the header is delivered, each subsequent WebSocket frame consists of a single WebM SimpleBlock. A keyframe occurs every 15 frames and when this happens a new Cluster is started.
The client also creates a MediaSource, and on receiving a frame from the WS, appends the content to its active buffer. The <video>
starts playback immediately after the first frame is appended.
Everything works reasonably well. My only issue is that the network jitter causes the playback position to drift from the actual time after a while. My current solution is to hook into the updateend
event, check the difference between the video.currentTime
and the timecode on the incoming Cluster and manually update the currentTime
if it falls outside an acceptable range. Unfortunately, this causes a noticeable pause and jump in the playback which is rather unpleasant.
The solution also feels a bit odd: I know exactly where the latest keyframe is, yet I have to convert it into a whole second (as per the W3C spec) before I can pass it into currentTime
, where the browser presumably has to then go around and find the nearest keyframe.
My question is this: is there a way to tell the Media Element to always seek to the latest keyframe available, or keep the playback time synchronised with the system clock time?