Complete question: Why is it more suitable to use a MediaElementAudioSourceNode rather than an AudioBuffer for longer sounds?

From MDN:

Objects of these types are designed to hold small audio snippets, typically less than 45 s. For longer sounds, objects implementing the MediaElementAudioSourceNode are more suitable.

From the specification:

This interface represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved 32-bit linear floating-point PCM values with a normal range of [−1, 1], but values are not limited to this range. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the audio element and MediaElementAudioSourceNode.

  1. What are the benefits of using a MediaElementAudioSourceNode over an AudioBuffer?
  2. Are there any disadvantages when using a MediaElementAudioSourceNode for short clips?
Maxime Dupré

1 Answer

  1. MediaElementAudioSourceNode has the potential ability to stream, and certainly to start playing before the entire sound file has been downloaded and decoded. It can also do this without converting (likely expanding!) the sound file to 32-bit linear PCM (CD-quality audio is only 16 bits per channel) and transcoding it to the output device's sample rate. For example, a 1-minute mono podcast recorded at 16-bit, 16 kHz is just under 2 megabytes natively (16,000 samples/s × 2 bytes × 60 s ≈ 1.9 MB); if you're playing back on a 48 kHz device (not uncommon), decoding it to a 32-bit, 48 kHz AudioBuffer means you're using up nearly 12 megabytes (48,000 × 4 bytes × 60 s ≈ 11.5 MB). See the first sketch after this list.

  2. MediaElementAudioSourceNode won't give you precise playback timing, or the ability to manage and play back lots of simultaneous sounds. The precision may be reasonable for your use case, but it won't be the sample-accurate timing an AudioBuffer can have. See the second sketch after this list.
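
A minimal sketch of the streaming approach from point 1 (not from the original answer; the file name is a placeholder):

```lang-js
const ctx = new AudioContext();

// A long asset stays in its compressed container and is streamed;
// the browser decodes and resamples a chunk at a time, so only a
// small window of 32-bit PCM is resident in memory at once.
const el = new Audio('podcast.mp3'); // placeholder URL
el.preload = 'auto';

const source = ctx.createMediaElementSource(el);
source.connect(ctx.destination);

// Assumes the context has been resumed via a user gesture
// (autoplay policy). Playback can begin before the whole file
// has downloaded.
el.play();
```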

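And a minimal sketch of the sample-accurate scheduling from point 2; `click.wav` and the 16-step loop are placeholders for illustration:

```lang-js
const ctx = new AudioContext();

async function playClicks() {
  // Decode the whole (short) clip up front; decodeAudioData also
  // resamples it to the context's sample rate.
  const response = await fetch('click.wav'); // placeholder URL
  const buffer = await ctx.decodeAudioData(await response.arrayBuffer());

  // Schedule many overlapping one-shots against the context's
  // sample clock; start(when) is honored to the sample, which
  // HTMLMediaElement.play() is not.
  const t0 = ctx.currentTime + 0.1;
  for (let i = 0; i < 16; i++) {
    const node = ctx.createBufferSource();
    node.buffer = buffer;
    node.connect(ctx.destination);
    node.start(t0 + i * 0.125); // sixteenth-note grid at 120 BPM
  }
}
```
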
cwilso
  • Interesting, why doesn't a `MediaElementSourceNode` need to transcode to the output device sample rate? I thought that if you played a sound that had a different sample rate than the output device, the pitch and speed of the sample would be altered (in my experience that is the case with an `AudioBuffer`). I do need the most precise playback timing that I can get, as I'm building some sort of DAW, so I'll stick with `AudioBuffer`s. Thanks for the help! – Maxime Dupré Mar 01 '17 at 01:01
  • It does need to transcode to the output sample rate (aka the AudioContext rate), but it can do that for a chunk of the stream at a time, not only the entire buffer. – cwilso Mar 02 '17 at 04:15
  • The AudioBuffers, if you're using decodeAudioData(), should be resampled appropriately into the right sample rate. – cwilso Mar 02 '17 at 04:16
  • I understand "sample accurate" from decades of working in native DAWs, but in what scenarios would it manifest with MediaElementSourceNode? Say I do...
    ```lang-js
    // partial code for example brevity
    const ctx = new AudioContext();
    const el1 = new Audio('one.wav');
    const el2 = new Audio('two.wav');
    const track1 = ctx.createMediaElementSource(el1);
    const track2 = ctx.createMediaElementSource(el2);
    track1.connect(ctx.destination);
    track2.connect(ctx.destination);
    // audio context resumed...
    el1.play();
    el2.play();
    // is this off by N random samples/ms? Or not, just not guaranteed?
    ```
    – user487869 Jan 06 '23 at 03:33