I have a buffer that contains a person's voice captured from a media stream, and I send it from JavaScript to NodeJS using socket.io.
I need to convert that buffer to text (speech to text, where the voice is stored as a buffer coming from the media stream).

I use helper functions in NodeJS (see below) that convert between Buffer and ArrayBuffer, and there is a package called node-blob that converts a buffer to an audio blob. I have searched a lot for how to convert audio, or even a raw buffer, to text, but without success. Is there any code or package that could help convert it to text?

JavaScript

navigator.mediaDevices
  .getUserMedia({
    video: true,
    audio: true,
  })
  .then((stream) => {
    setSrcVideo(stream);

    const mediasStream = new MediaStream();
    mediasStream.addTrack(stream.getVideoTracks()[0]);
    mediasStream.addTrack(stream.getAudioTracks()[0]);

    const mediaRecorder = new MediaRecorder(mediasStream);

    socket.emit('ready');

    mediaRecorder.addEventListener('dataavailable', (event) => {
      if (event.data && event.data.size > 0) {
        socket.emit('send-chunks', event.data);
      }
    });
        
    socket.on('start-recording', () => {
      mediaRecorder.start(1000);
    });

  });

and I receive that buffer via socket.on('send-chunks') in NodeJS like this:

NodeJS

// connection to socket.io
io.on('connection', (socket) => {
  socket.on('ready', () => {
    socket.emit('start-recording');
  });

  socket.on('send-chunks', (chunks) => {
    // convert to text
  });
});

// helper functions
const toArrayBuffer = (buffer) => {
  const arrayBuffer = new ArrayBuffer(buffer.length);
  const view = new Uint8Array(arrayBuffer);
  for (let i = 0; i < buffer.length; ++i) {
    view[i] = buffer[i];
  }
  return arrayBuffer;
};

const toBuffer = (arrayBuffer) => {
  const buffer = Buffer.alloc(arrayBuffer.byteLength);
  const view = new Uint8Array(arrayBuffer);
  for (let i = 0; i < buffer.length; ++i) {
    buffer[i] = view[i];
  }
  return buffer;
};
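
For reference, a minimal sketch of how the helpers above could be written with Node's built-in Buffer API, plus a way to collect the incoming chunks into one buffer. It assumes that socket.io delivers each chunk as a Buffer or ArrayBuffer, and that chunks produced by `mediaRecorder.start(1000)` only form a decodable file once concatenated, since typically only the first chunk carries the container header. `createChunkCollector` is a hypothetical name, not part of any package. The sketch does not perform recognition itself; the combined buffer would still have to be passed to a speech-to-text engine or service.

```javascript
// ArrayBuffer -> Buffer: Buffer.from creates a view over the same memory (no copy).
const toBuffer = (arrayBuffer) => Buffer.from(arrayBuffer);

// Buffer -> ArrayBuffer: copy only the bytes this Buffer covers, because small
// Buffers may be slices of a larger shared pool.
const toArrayBuffer = (buffer) =>
  buffer.buffer.slice(buffer.byteOffset, buffer.byteOffset + buffer.byteLength);

// Hypothetical helper: accumulates the chunks of one recording in arrival order.
const createChunkCollector = () => {
  const chunks = [];
  return {
    // Accepts a Buffer, ArrayBuffer, or Uint8Array.
    add(chunk) {
      chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
    },
    // Total number of bytes collected so far.
    size() {
      return chunks.reduce((total, chunk) => total + chunk.length, 0);
    },
    // One contiguous Buffer holding everything received so far.
    toBuffer() {
      return Buffer.concat(chunks);
    },
  };
};
```

Inside the `send-chunks` handler, one collector per socket could call `collector.add(chunks)`, and `collector.toBuffer()` would then give the recording received so far.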
  • You can convert buffer to text with `buffer.toString()`. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray/toString – Dshiz Dec 11 '21 at 14:05
  • @Dshiz that is not what I need. I want the voice of the person in the media stream converted to text (speech to text, where the voice is stored as a buffer coming from the media stream) – Ahmed El-Tabarani Dec 11 '21 at 14:24
  • Ah, that wasn't clear in the original question. You might get some mileage out of https://stackoverflow.com/questions/35643347/speech-recognition-nodejs – Dshiz Dec 11 '21 at 14:37
