
I realize that the title may be a little confusing, so let me give some context.

I'm writing an extension to do some audio manipulation with Google Meet, and, after studying its behavior, I found a weird issue that I can't seem to wrap my head around.

Google Meet seems to use three <audio> elements to play audio, each with its own MediaStream. Through some testing, it seems that:

  • Muting the <audio> element stops Google Meet's audio visualizations of who is talking.
  • Swapping the .srcObject properties of two audio elements and then calling .play() on them does not affect Google Meet's audio visualizations.
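The swap test in the second bullet can be sketched from the console roughly like this (a hypothetical helper of mine; the element indices and the `play()` error handling are assumptions, not Meet's code):

```javascript
// Hypothetical console sketch of the srcObject swap test:
// swap the MediaStreams of two <audio> elements, then restart playback.
function swapStreams(elA, elB) {
  const tmp = elA.srcObject;
  elA.srcObject = elB.srcObject;
  elB.srcObject = tmp;
  // play() returns a promise; ignore autoplay rejections in the console.
  elA.play().catch(() => {});
  elB.play().catch(() => {});
}

// e.g. swap the first two of Meet's three audio elements:
// const [a, b] = document.querySelectorAll("audio");
// swapStreams(a, b);
```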

These observations point to Google Meet connecting the source MediaStream into its audio processing graph to create the visualizations, rather than capturing the <audio> element's output, since I can swap MediaStreams without affecting the visualizations.

However, one more thing that I noticed seems to make no sense given the information above:

  • Adding a new MediaStreamAudioSourceNode created from the .srcObject of the <audio> element and connecting it to an AnalyserNode showed that, even when I mute the <audio> element, I can still analyse the audio flowing through the MediaStream.

Here's some example code and outputs done through the browser console:

ac = new AudioContext();
an = ac.createAnalyser();
sn = ac.createMediaStreamSource(document.querySelectorAll("audio")[0].srcObject);
sn.connect(an);

// Reads the current time-domain samples from the AnalyserNode.
function analyse(aNode) {
  const ret = new Float32Array(aNode.frequencyBinCount);
  aNode.getFloatTimeDomainData(ret);
  return ret;
}

analyse(an)
// > Float32Array(1024) [ 0.342987060546875, 0.36688232421875, 0.37115478515625, 0.362457275390625, 0.35150146484375, 0.3402099609375, 0.321075439453125, 0.308746337890625, 0.29779052734375, 0.272552490234375, … ]

document.querySelectorAll("audio")[0].muted = true
analyse(an)
// > Float32Array(1024) [ -0.203582763671875, -0.258026123046875, -0.31134033203125, -0.34375, -0.372802734375, -0.396484375, -0.3919677734375, -0.36328125, -0.31689453125, -0.247650146484375, … ]

// Here, I mute the microphone on *my end* through Google Meet.

analyse(an)
// > Float32Array(1024) [ -0.000030517578125, 0, 0, -0.000030517578125, -0.000091552734375, -0.000091552734375, -0.000091552734375, -0.00006103515625, 0, 0.000030517578125, … ]
// The values here are much closer to zero.

As you can see, when the audio element is muted, the AnalyserNode can still pick up on the audio, but Meet's visualizations break. That is what I don't understand. How can that be?

How can a connected AnalyserNode keep receiving audio when the <audio> element is muted, while whatever drives Meet's visualizations breaks, all without using .captureStream()?

Another weird thing is that this only happens on Chrome. On Firefox, Meet's visualizations keep working even if the audio element is muted. I think this might be related to a known Chrome issue where a MediaStream won't output anything to the audio graph unless it is also attached to a playing <audio> element (https://stackoverflow.com/a/55644983), but I can't see how that would affect a muted <audio> element.
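For reference, the usual workaround for that Chrome quirk is to attach the remote MediaStream to a muted media element just to "activate" it for the Web Audio graph. A sketch of that workaround (my own helper, not something Meet is confirmed to do):

```javascript
// Sketch of the common workaround for the Chrome quirk linked above:
// a remote MediaStream only feeds the Web Audio graph once it is also
// attached to a playing media element. The element itself can stay muted.
function keepStreamAlive(stream) {
  const el = new Audio();
  el.muted = true;       // we only need the element to "consume" the stream
  el.srcObject = stream;
  el.play().catch(() => {});
  return el;             // keep a reference so it isn't garbage collected
}
```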

Louie Torres

1 Answer


It's a bit confusing, but the behavior of audioElement.captureStream() is actually different from using a MediaElementAudioSourceNode.

new MediaStreamAudioSourceNode(audioContext, { mediaStream: audioElement.captureStream() });

// is not equal to

new MediaElementAudioSourceNode(audioContext, { mediaElement: audioElement });

The stream obtained by calling audioElement.captureStream() is not affected by any volume changes on the audio element. Calling audioElement.captureStream() also doesn't change the volume of the audio element itself.

However, using a MediaElementAudioSourceNode re-routes the audio of an audio element into an AudioContext. The audio is then affected by any volume changes made to the audio element, which means muting the element also mutes the audio that gets piped into the AudioContext.

On top of that, using a MediaElementAudioSourceNode makes the audio element itself silent.
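That difference could be checked from the console with a sketch like this (hypothetical; `isNearSilence`, its threshold, and `checkMuteBehaviour` are my own helpers, not part of any API):

```javascript
// Small helper: are all time-domain samples effectively zero?
function isNearSilence(samples, eps = 0.001) {
  return samples.every((s) => Math.abs(s) < eps);
}

// Hypothetical console check: with a MediaElementAudioSourceNode,
// muting the element should silence what the AnalyserNode sees,
// unlike the MediaStreamAudioSourceNode case in the question.
function checkMuteBehaviour(audioEl) {
  const ctx = new AudioContext();
  const src = new MediaElementAudioSourceNode(ctx, { mediaElement: audioEl });
  const analyser = new AnalyserNode(ctx);
  src.connect(analyser);
  src.connect(ctx.destination); // playback is now re-routed through the context

  audioEl.muted = true;
  const samples = new Float32Array(analyser.frequencyBinCount);
  analyser.getFloatTimeDomainData(samples);
  return isNearSilence(samples);
}
```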

I assume Google Meet uses a MediaElementAudioSourceNode for each audio element to process the audio.

chrisguttandin
  • "I assume Google Meet uses a MediaElementAudioSourceNode for each audio element to process the audio." I don't believe it does, since a search across the loaded JS files yields no results for `MediaElementAudioSourceNode` or `.createMediaElementSource`. Plus, creating such a node from the browser console doesn't silence the audio element, but adding a `muted` attribute or lowering the `volume` property sure does. So, I'm not sure what that all means. – Louie Torres Sep 16 '21 at 12:12
  • Actually, I think I should rephrase the question into something simpler. Thanks for the info, though! – Louie Torres Sep 16 '21 at 13:37
  • Sorry for causing even more confusion. I think you're right and it has something to do with the bug in Chrome that you mentioned. As far as I know one needs to assign a remote stream to the `srcObject` of a media element in order to "activate" it. Setting the `srcObject` to null will deactivate the stream again. But I think setting it to muted doesn't change anything. – chrisguttandin Sep 18 '21 at 16:12