I just built a real-time app using socket.io where a "master" user can trigger sounds on receiving devices (desktop browsers, mobile browsers). That master user sees a list of sound files, and can click "Play" on a sound file.
Playback is instant on desktop browsers. On mobile devices, however, there is a 0.5-2 second delay (about 1 second on my Nexus 4 and iPhone 5, and 1-2 seconds on an iPhone 3GS).
I've tried several things to make audio playback faster on mobile. Right now (at what I'd call the best "phase" of optimization so far), I combine all the MP3s into a single audio file (the build produces .mp3, .ogg, and .mp4 versions). I need ideas on how to reduce this delay further. The bottleneck really seems to be in the HTML5 audio methods such as .play().
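To put a number on that delay, something like the following could be used. This is only a sketch: `measurePlayLatency`, its `onResult` callback, and the injectable `now` clock are hypothetical names, not part of my app. It assumes the delay of interest is the time between calling `.play()` and the element firing the standard `playing` event.

```javascript
// Hypothetical helper: measures how long an <audio>-like element takes to go
// from play() being called to its 'playing' event firing.
// `audio` only needs play(), addEventListener() and removeEventListener().
// `now` is injectable (defaults to Date.now) so this can run outside a browser.
function measurePlayLatency(audio, onResult, now) {
    now = now || Date.now;
    var startedAt = now();

    function onPlaying() {
        // One-shot listener: detach before reporting the elapsed milliseconds.
        audio.removeEventListener('playing', onPlaying);
        onResult(now() - startedAt);
    }

    audio.addEventListener('playing', onPlaying);
    audio.play();
}

// Possible usage inside the socket handler (audioFile is the <audio> element):
// measurePlayLatency(audioFile, function (ms) {
//     console.log('play() to playing: ' + ms + ' ms');
// });
```

Logging this on each device would show whether the 0.5-2 seconds is really spent inside the audio element rather than in the socket round trip.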
On the receivers, I use the following markup:
<audio id="audioFile" preload="auto">
    <source src="/output.m4a" type="audio/mp4"/>
    <source src="/output.mp3" type="audio/mpeg"/>
    <source src="/output.ogg" type="audio/ogg"/>
    <p>Your browser does not support HTML5 audio.</p>
</audio>
In my JS:
var audioFile = document.getElementById('audioFile');

// Little hack for mobile, as only a user-generated click will enable us to play the sounds
$('#prepareAudioBtn').on('click', function () {
    $(this).hide();
    audioFile.play();
    audioFile.pause();
    audioFile.currentTime = 0;
});
// Master user triggered a sound sprite to play
socket.on('playAudio', function (audioClip) {
    if (audioFile.paused) {
        audioFile.play();
    }
    audioFile.currentTime = audioClip.startTime;

    // Checks every 750ms to pause the clip if the endTime has been reached.
    // There is a second of "silence" between each sound sprite, so the pause is sure to happen at a correct time.
    timeListener(audioClip.endTime);
});
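// Handle of the running end-of-clip poll, or null when none is active.
var interval = null;

function clearTimeListener() {
    clearInterval(interval);
    interval = null;
}

function timeListener(clipEndTime) {
    // Cancel any poll left over from a previous clip.
    if (interval !== null) {
        clearTimeListener();
    }
    interval = setInterval(function () {
        if (audioFile.currentTime >= clipEndTime) {
            audioFile.pause();
            clearTimeListener();
        }
    }, 750);
}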
I also considered using a blob for each sound, but some sounds run for minutes, which is why I resorted to combining all the sounds into one big audio file (better than having several audio tags on the page, one per clip).
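For reference, the `startTime` / `endTime` values sent with each `playAudio` event are offsets into that combined file. A minimal sketch of how such offsets could be derived, assuming the clips are concatenated in order with a fixed gap of silence between them (`buildSpriteMap` and its parameters are hypothetical names, not my actual build code):

```javascript
// Hypothetical helper: given clip durations in seconds (in concatenation
// order) and the silence gap between clips, returns each clip's
// startTime/endTime within the combined file.
function buildSpriteMap(durations, gapSeconds) {
    var offset = 0;
    return durations.map(function (duration) {
        var clip = { startTime: offset, endTime: offset + duration };
        // The next clip starts after this one plus the silent gap.
        offset = clip.endTime + gapSeconds;
        return clip;
    });
}

// Example: three clips of 2s, 3.5s and 10s separated by 1s of silence.
var sprites = buildSpriteMap([2, 3.5, 10], 1);
// sprites[1] is { startTime: 3, endTime: 6.5 }
```

With a 1 second gap, a 750ms poll always lands inside the silence before the next clip begins.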