
I am taking a MediaStream and merging two separate tracks (video and audio) using a canvas and the WebAudio API. The MediaStream itself does not seem to fall out of sync, but after reading it into a MediaRecorder and buffering it into a video element, the audio always seems to play much earlier than the video. Here's the code that seems to have the issue:

let stream = new MediaStream();

// Get the mixed sources drawn to the canvas
this.canvas.captureStream().getVideoTracks().forEach(track => {
  stream.addTrack(track);
});

// Add mixed audio tracks to the stream
// https://stackoverflow.com/questions/42138545/webrtc-mix-local-and-remote-audio-steams-and-record
this.audioMixer.dest.stream.getAudioTracks().forEach(track => {
  stream.addTrack(track);
});

// stream = stream;
let mediaRecorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=opus,vp8' });

let mediaSource = new MediaSource();
let video = document.createElement('video');
video.src = URL.createObjectURL(mediaSource);
document.body.appendChild(video);
video.controls = true;
video.autoplay = true;

// Source open
mediaSource.onsourceopen = () => {
  let sourceBuffer = mediaSource.addSourceBuffer(mediaRecorder.mimeType);

  mediaRecorder.ondataavailable = (event) => {

    if (event.data.size > 0) {
      const reader = new FileReader();
      reader.readAsArrayBuffer(event.data);
      reader.onloadend = () => {
        sourceBuffer.appendBuffer(reader.result);
        console.log(mediaSource.sourceBuffers);
        console.log(event.data);
      }
    }
  }
  mediaRecorder.start(1000);
}

AudioMixer.js

export default class AudioMixer {

  constructor() {
    // Initialize an audio context
    this.audioContext = new AudioContext();

    // Destination outputs one track of mixed audio
    this.dest = this.audioContext.createMediaStreamDestination();

    // Array of current streams in mixer
    this.sources = [];
  }

  // Add an audio stream to the mixer
  addStream(id, stream) {
    // Get the audio tracks from the stream and add them to the mixer
    let sources = stream.getAudioTracks().map(track => this.audioContext.createMediaStreamSource(new MediaStream([track])));
    sources.forEach(source => {

      // Add it to the current sources being mixed
      this.sources.push(source);
      source.connect(this.dest);

      // Connect to analyser to update volume slider
      let analyser = this.audioContext.createAnalyser();
      source.connect(analyser);
      ...
    });
  }

  // Remove all current sources from the mixer
  flushAll() {
    this.sources.forEach(source => {
      source.disconnect(this.dest);
    });

    this.sources = [];
  }

  // Clean up the audio context for the mixer
  cleanup() {
    this.audioContext.close();
  }
}

I assume it has to do with how the data is pushed into the MediaSource buffer but I'm not sure. What am I doing that de-syncs the stream?

Jacob Greenway
  • Is it also desynced when you play the recorded file as a whole instead of playing its chunks as buffers to your MSE? – Kaiido Sep 02 '18 at 07:22
  • I don't record the stream at all, I'm just using the MediaRecorder object to read chunks to send it later. I did just now pipe it to a file and try it and it seems to still be out of sync. – Jacob Greenway Sep 02 '18 at 07:25
  • Also note this fiddle: https://jsfiddle.net/nthyfgvs/ seems to stay in sync, but this is using a single getUserMedia call, as opposed to the many separate ones I do to populate the canvas/audioMixer. – Jacob Greenway Sep 02 '18 at 07:27
  • ;-) You are recording it. That you don't save the chunks is another story. And now we've found that the MSE is not the problem. So keep it off the road for now, until you find what's really causing the issue. What if you only pass the stream as a ` – Kaiido Sep 02 '18 at 07:33
  • I think I found the issue. If I add a getUserMedia call for audio before recording and add that to the audioMixer it syncs perfectly. So maybe having an empty audio track is messing with it? – Jacob Greenway Sep 02 '18 at 07:41 (a sketch of this workaround follows these comments)
  • @JacobGreenway Yeah, I wouldn't even instantiate the MediaRecorder until your tracks on the MediaStream you're recording are set up. This whole area is quite buggy in browsers as it is... don't give them anything weird if you can help it. – Brad Sep 04 '18 at 00:48
  • @JacobGreenway can you please provide a code snippet of your solution? – Suman Bogati Jul 07 '20 at 10:42

3 Answers


A late reply to an old post, but it might help someone ...

I had exactly the same problem: I have a video stream that should be supplemented by an audio stream, in which short sounds (AudioBuffer) are played from time to time. The whole thing is recorded via MediaRecorder. Everything works fine on Chrome, but on Chrome for Android all the sounds were played back in quick succession; the "when" parameter for "play()" was ignored on Android (audioContext.currentTime continued to increase over time ... - that was not the point).

My solution is similar to Jacob's comment of Sep 02 '18 at 07:41: I created and connected a sine-wave oscillator at an inaudible 48,000 Hz that plays continuously in the audio stream during recording. Apparently this keeps the timeline progressing properly.
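
A minimal sketch of that approach, assuming an existing AudioContext and a MediaStreamAudioDestinationNode like the AudioMixer's dest above (the exact frequency and the small safety gain are illustrative):

// Keep the recorded audio track active with a constant, inaudible oscillator.
// audioContext and dest are assumed to already exist (see AudioMixer above).
const osc = audioContext.createOscillator();
osc.frequency.value = 48000;   // far above the audible range

const gain = audioContext.createGain();
gain.gain.value = 0.001;       // keep the signal effectively silent as well

osc.connect(gain);
gain.connect(dest);            // dest = audioContext.createMediaStreamDestination()
osc.start();

// dest.stream keeps producing audio for the whole recording, so the
// MediaRecorder's audio timeline advances even when no sounds are scheduled.

Note that the oscillator has to feed the MediaStreamAudioDestinationNode whose stream is being recorded, not (only) audioContext.destination.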

  • @Martin Luckow, as you said, I am creating a sine-wave oscillator something like: `var ac = new AudioContext({ sampleRate: 48000}); var osc = ac.createOscillator(); var mediaRecorder = new MediaRecorder(stream); osc.connect(ac.destination);`, but there is still a syncing problem. Am I doing something wrong in the above code? Can you please add a code snippet to your answer? – Suman Bogati Jul 07 '20 at 05:42

An RTP endpoint that is emitting multiple related RTP streams that require synchronization at the other endpoint(s) MUST use the same RTCP CNAME for all streams that are to be synchronized. This requires a short-term persistent RTCP CNAME that is common across several RTP streams, and potentially across several related RTP sessions. A common example of such use occurs when lip-syncing audio and video streams in a multimedia session, where a single participant has to use the same RTCP CNAME for its audio RTP session and for its video RTP session. Another example might be to synchronize the layers of a layered audio codec, where the same RTCP CNAME has to be used for each layer.

https://datatracker.ietf.org/doc/html/rfc6222#page-2

Usama

There is a bug in Chrome that plays buffered media-stream audio at 44,100 Hz even when it was encoded at 48,000 Hz, which leads to gaps and video desync. All other browsers seem to play it fine. You can either change the codec to one that supports 44.1 kHz encoding, or play the file from a web link as the source (that way Chrome can play it correctly).
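
If the mismatch described here is the culprit, one possible mitigation (an assumption extrapolated from this answer, not something it spells out) is to create the mixer's AudioContext at 44,100 Hz so the recorded audio already matches the rate Chrome plays back:

// Assumption: force the Web Audio graph to 44.1 kHz so the recorded audio
// avoids the 48 kHz / 44.1 kHz mismatch described above.
const audioContext = new AudioContext({ sampleRate: 44100 });
const dest = audioContext.createMediaStreamDestination();

// dest.stream.getAudioTracks() can then be added to the recorded MediaStream
// exactly as in the question's code.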

alemjerus