5

I have a server sending chunks of raw audio over a websocket. The idea is to receive them and play them back as smoothly as possible.

Here is the most important piece of code:

ws.onmessage = function (event) {
    // event.data is an ArrayBuffer (requires ws.binaryType = 'arraybuffer')
    var view = new Int16Array(event.data);
    var viewf = new Float32Array(view.length);

    // Convert 16-bit integer samples [-32768, 32767] to floats in [-1, 1)
    for (var i = 0; i < view.length; i++) {
        viewf[i] = view[i] / 32768;
    }

    audioBuffer = audioCtx.createBuffer(1, viewf.length, 22050);
    audioBuffer.getChannelData(0).set(viewf);
    source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start(0);
};

This works decently well, but there are cracks in the playback: the network latency is not constant, so a new chunk doesn't arrive exactly when the previous one finishes playing. I can end up with either two buffers playing together for a short time, or none playing at all.

I tried:

  • to hook source.onended to play the next chunk, but it's not seamless: there is a crack at the end of every chunk, and those gaps accumulate, so the playback falls further and further behind the stream.
  • to append the new data to the currently playing buffer, but this seems to be forbidden: buffers are of fixed size.

Is there a proper solution to fix that playback? The only requirement is to play the uncompressed audio coming from a websocket.

EDIT: Solution: since I know the length of my buffers, I can schedule the playback this way:

if (nextStartTime == 0) // first chunk: start half a buffer late to absorb jitter
    nextStartTime = audioCtx.currentTime + (audioBuffer.length / audioBuffer.sampleRate) / 2;
source.start(nextStartTime);
// the next chunk starts exactly when this one ends
nextStartTime += audioBuffer.length / audioBuffer.sampleRate;

The first time, I schedule playback to begin half a buffer later, which leaves headroom for the worst expected latency. After that, I set each chunk's start time to the exact moment the previous chunk ends.
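For reference, here is a sketch of the complete handler with this scheduling (a sketch of my setup: mono 16-bit PCM at 22050 Hz, with nextStartTime initialized to 0 before the first message):

var audioCtx = new AudioContext();
var nextStartTime = 0;

ws.binaryType = 'arraybuffer';
ws.onmessage = function (event) {
    var view = new Int16Array(event.data);
    var viewf = new Float32Array(view.length);
    for (var i = 0; i < view.length; i++)
        viewf[i] = view[i] / 32768; // int16 -> float in [-1, 1)

    var audioBuffer = audioCtx.createBuffer(1, viewf.length, 22050);
    audioBuffer.getChannelData(0).set(viewf);

    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);

    // Each chunk starts exactly where the previous one ends;
    // the first chunk starts half a buffer late to absorb jitter.
    if (nextStartTime === 0)
        nextStartTime = audioCtx.currentTime + audioBuffer.duration / 2;
    source.start(nextStartTime);
    nextStartTime += audioBuffer.duration;
};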

Ploppe

3 Answers

6

You should probably start with https://www.html5rocks.com/en/tutorials/audio/scheduling/ which explains very well how to schedule things in WebAudio.

For your use case, you should also take advantage of the fact that you know the sample rate of the PCM data and how many samples you've received. Together these determine how long each buffer takes to play out; use that to figure out when to schedule the next buffer.

(But note that if the PCM sample rate is not the same as audioCtx.sampleRate, the data will be resampled, which might mess up your timing.)
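For example (a sketch, with names of my own choosing; the { sampleRate } constructor option is in the spec but may be missing from older browsers), you can ask the context to run at the stream's rate so the PCM is never resampled, and compute the playout time directly:

// Keep the context at the stream's sample rate to avoid resampling.
var streamRate = 22050; // rate of the incoming PCM chunks
var audioCtx = new AudioContext({ sampleRate: streamRate });

// Playout time of a chunk is just sample count divided by sample rate
// (equivalently, AudioBuffer.duration once the buffer is created).
function chunkDuration(sampleCount) {
    return sampleCount / streamRate; // seconds
}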

Raymond Toy
  • Thanks, this is what I did, rather than simply "play right now". I now have something that sounds way better! – Ploppe Apr 30 '17 at 13:05
  • The AudioBuffer already has a [duration property](https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer#Properties) you can use instead of calculating it yourself. – daz Oct 31 '18 at 08:41
  • Hello Raymond, could you explain a bit with code how you perform that? Or @Ploppe maybe you have some code? Best. – JSmith Feb 11 '23 at 19:40
  • Do you mean to schedule the playback of multiple buffers? Doesn't the Edit in the OP explain it? – Raymond Toy Feb 11 '23 at 20:23
2

There's a better way to handle this these days: consider using Media Source Extensions.

Instead of having to schedule buffers and do it all yourself, you basically dump your received data into a buffer and let the browser worry about buffered playback, as if it were a file over HTTP.

Chrome supports playback of WAV files. Since your data is in raw PCM, you'll need to spoof a WAV file header. Fortunately, this isn't too difficult: http://soundfile.sapp.org/doc/WaveFormat/
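For example, a sketch of building that header with a DataView (field layout taken from the WaveFormat page above; the helper name and the 16-bit mono usage are mine):

// Build the canonical 44-byte RIFF/WAVE header for raw PCM.
function wavHeader(pcmByteLength, sampleRate, channels, bitsPerSample) {
    var header = new ArrayBuffer(44);
    var v = new DataView(header);
    var writeStr = function (off, s) {
        for (var i = 0; i < s.length; i++) v.setUint8(off + i, s.charCodeAt(i));
    };
    var byteRate = sampleRate * channels * bitsPerSample / 8;

    writeStr(0, 'RIFF');
    v.setUint32(4, 36 + pcmByteLength, true);            // ChunkSize
    writeStr(8, 'WAVE');
    writeStr(12, 'fmt ');
    v.setUint32(16, 16, true);                           // Subchunk1Size (PCM)
    v.setUint16(20, 1, true);                            // AudioFormat 1 = PCM
    v.setUint16(22, channels, true);
    v.setUint32(24, sampleRate, true);
    v.setUint32(28, byteRate, true);
    v.setUint16(32, channels * bitsPerSample / 8, true); // BlockAlign
    v.setUint16(34, bitsPerSample, true);
    writeStr(36, 'data');
    v.setUint32(40, pcmByteLength, true);                // Subchunk2Size
    return header;
}

// e.g. wavHeader(chunk.byteLength, 22050, 1, 16) for the stream in the question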

Brad
  • Sadly, Media Source Extensions doesn't support WAV currently.. :( https://github.com/w3c/media-source/issues/55 – bertrandg May 30 '17 at 09:35
  • Hi Brad, do you have any links showing how to use MediaSource for decoding the mp3 data chunks coming from socket? – Keyne Viana Dec 01 '17 at 19:56
  • Update: Still not supported. `MediaSource.isTypeSupported('audio/wav') == false` https://developer.mozilla.org/en-US/docs/Web/API/MediaSource/isTypeSupported https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types – Andrew Jul 13 '20 at 04:09
  • @Andrew I thought for sure I've done this before, but now I'm kicking myself for not linking to the relevant codec. If I remember correctly, the codec needed to be specified. Something like `MediaSource.isTypeSupported('audio/wav; codec=1')`. But, that isn't working, and in Googling I found a conflicting answer I wrote, ha. https://stackoverflow.com/a/57119057/362536 So, I'll delete this answer here saying it's possible, because it's either always been wrong and I was confused, or it's wrong now. Sorry for any confusion! – Brad Jul 13 '20 at 04:19
  • I would not delete this answer. – Andrew Jul 13 '20 at 04:23
  • Do you happen to know if any of the other audio codecs that are supported are simple to spoof? – Andrew Jul 13 '20 at 04:24
  • @Andrew Where's the audio coming from? MSE supports Opus in WebM and what not, if that's doable for you. Can you post a new question and add the details of what you're trying to do, specifically? – Brad Jul 13 '20 at 04:33
  • Irrelevant. I'm asking about spoofing an audio codec, like your answer suggests; the nature of the audio thus has no pertinence. – Andrew Jul 13 '20 at 05:48
  • @Andrew What specifically are you spoofing? – Brad Jul 13 '20 at 13:03
  • From your answer: "Since your data is in raw PCM, you'll need to spoof a WAV file header." – Andrew Jul 13 '20 at 15:08
  • @Andrew The codec is PCM. The WAV file is just the wrapper. If your audio is already in PCM, it's trivial to output the few bytes needed for the WAV file. http://soundfile.sapp.org/doc/WaveFormat/ At least in Chrome, PCM doesn't seem to be supported in any other container either, so you'll need a different codec. How you get the data encoded depends on where the audio is from, so yes, it's pertinent. Post a new question. – Brad Jul 13 '20 at 15:35
  • One issue with MediaSourceExtensions: you can't start them with sample accuracy. If this is important to you, then you'll need to do something else. But if it's ok for the first buffer to start a bit off, then I'd go this route. – Raymond Toy Sep 14 '20 at 17:08
  • @RaymondToy Do you know if WAV (or PCM in WebM/MKV, or even PCM in MOV/MP4/ISOBMFF) is going to be supported in MSE someday? – Brad Sep 14 '20 at 17:10
  • I don't know. Your best bet is to file an issue requesting such support. I would certainly like some kind of lossless format that supports floating-point like WAV or FLAC. – Raymond Toy Sep 14 '20 at 20:44
1

I resolved this problem in my answer here. Please see it for more information and lots of code.

A quick summary:

You have to create a new AudioBuffer and AudioBufferSourceNode both (or at least the latter) for every piece of data that you want to buffer... I tried looping the same AudioBuffer, but once you set .buffer on the AudioBufferSourceNode, any modifications you make to the AudioBuffer become irrelevant.

(NOTE: These classes have base/parent classes you should look at as well (referenced in the docs).)
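In code, that per-chunk pattern looks like this (a minimal sketch; the playChunk name and parameters are placeholders, see the linked answer for the full code):

// Source nodes are one-shot: create a fresh AudioBufferSourceNode
// (and here a fresh AudioBuffer too) for every chunk you receive.
function playChunk(audioCtx, samples, sampleRate, when) {
    var buffer = audioCtx.createBuffer(1, samples.length, sampleRate);
    buffer.getChannelData(0).set(samples);

    var source = audioCtx.createBufferSource(); // cannot be start()ed twice
    source.buffer = buffer;
    source.connect(audioCtx.destination);
    source.start(when);
}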

Andrew