Trying to understand the Web Audio API better. We're using it to create an AudioContext and then sending audio to be transcribed. I want to be able to determine when there is a natural pause in speech or when the user stopped speaking.
Is there some data in onaudioprocess
callback that can be accessed to determine pauses/breaks in speech?
let context = new AudioContext();
context.onstatechange = () => {};
this.setState({ context: context });
let source = context.createMediaStreamSource(stream);
let processor = context.createScriptProcessor(4096, 1, 1);
source.connect(processor);
processor.connect(context.destination);
processor.onaudioprocess = (event) => {
// Do some magic here
}
I tried a solution that is suggested on this post but did not achieve the results I need. Post: HTML Audio recording until silence?
When I parse for silence as the post suggests, I get the same result - either 0
or 128
let context = new AudioContext();
let source = context.createMediaStreamSource(stream);
let processor = context.createScriptProcessor(4096, 1, 1);
source.connect(processor);
processor.connect(context.destination);
/***
* Crete analyser
*
**/
let analyser = context.createAnalyser();
analyser.smoothingTimeConstant = 0;
analyser.fftSize = 2048;
let buffLength = analyser.frequencyBinCount;
let arrayFreqDomain = new Uint8Array(buffLength);
let arrayTimeDomain = new Uint8Array(buffLength);
processor.connect(analyser);
processor.onaudioprocess = (event) => {
/**
*
* Parse live real-time buffer looking for silence
*
**/
let f, t;
analyser.getByteFrequencyData(arrayFreqDomain);
analyser.getByteTimeDomainData(arrayTimeDomain);
for (var i = 0; i < buffLength; i++) {
arrayFreqDomain[i]; <---- gives 0 value always
arrayTimeDomain[i]; <---- gives 128 value always
}
}
Looking at the documentation for the getByteFrequencyData
method I can see how it is supposed to be giving a different value (in the documentation example it will give a different barHeight), but it isn't working for me. https://developer.mozilla.org/en-US/docs/Web/API/AnalyserNode/getByteFrequencyData#Example