I use the Google Speech-to-Text service via the @google-cloud/speech library in my Node.js Firebase Functions. However, I want to use it with streaming data, and as far as I understand that is not possible in Firebase Functions.

So I have decided to do it on the client side, which is Angular in my case. However, I cannot find any useful documentation.

I want to know whether it is possible to use the Google Speech-to-Text service from Angular in a secure way.

Here is my function in Node.js:

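    // Assumes the v2 callable API, where the handler receives a request with `auth` and `data`.
    import {onCall, HttpsError} from "firebase-functions/v2/https";
    import {SpeechClient} from "@google-cloud/speech";

    export const transcriptAudio = onCall(async (request) => {
      // Reject callers that are not signed in.
      const userId = request.auth?.uid;
      if (!userId) {
        throw new HttpsError('unauthenticated', 'The user is not authenticated.');
      }

      // Audio payload sent by the client (expected to be a base64-encoded string).
      const audioByte = request.data.audioByte as string;

      // Detects speech in the audio content
      const client = new SpeechClient();
      const [response] = await client.recognize({
        audio: {content: audioByte},
        config: {
          encoding: "LINEAR16",
          audioChannelCount: 2,
          languageCode: "en-US",
        },
      });

      // Join the top alternative of each result into a single transcript.
      const transcription = response.results
        ?.map(result => result.alternatives![0].transcript)
        .join('\n');
      return transcription;
    });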

UPDATE 1: I asked ChatGPT, and it suggested using Firebase Functions to authenticate the user and obtain an access token, then using that access token on the Angular side to authenticate the API requests. That sounds reasonable to me.

UPDATE 2: It turns out that I need gRPC to use Speech-to-Text with streaming data. So I need to set up and use a library like ngx-grpc to achieve that, is that right? It seems like a lot of work, and I don't want to miss a much easier solution.

mctuna

1 Answer

> I use the Google Speech-to-Text service via the @google-cloud/speech library in my Node.js Firebase Functions. However, I want to use it with streaming data, and as far as I understand that is not possible in Firebase Functions.

Cloud Functions are not a good fit for streaming. They are designed to be stateless: each invocation handles a request, returns a response, and is then cleaned up. A function does not keep a socket open to the caller, so once the response is sent the connection is closed and cannot be reused. For your speech-to-text use case, this means there is no way to stream audio in and receive transcripts back over a long-lived connection.

Instead, as you assumed, you will have to implement that feature in the Angular client app. One example of such an implementation is "Creating a Speech Recognition App in Angular" by Donishka Tharindu on Medium.
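
To give a rough idea of what the client-side approach can look like, here is a minimal sketch built on the browser's Web Speech API (`SpeechRecognition`, exposed as `webkitSpeechRecognition` in Chrome). Note that this uses the browser's built-in recognizer rather than the Cloud Speech-to-Text API. The helper name `startDictation` and the small interfaces are my own, hypothetical and for illustration only; they are not taken from the linked article. In an Angular app you would probably wrap this in a service.

```typescript
// Minimal structural types for the parts of the Web Speech API used below.
// They are declared locally so the sketch does not depend on DOM typings.
interface RecognitionAlternative { transcript: string; }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative; }
interface RecognitionEvent {
  resultIndex: number;
  results: ArrayLike<RecognitionResult>;
}
interface Recognition {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  onresult: ((event: RecognitionEvent) => void) | null;
  start(): void;
  stop(): void;
}
type RecognitionCtor = new () => Recognition;

// Hypothetical helper: starts continuous recognition and reports each
// finalized phrase through the callback. The constructor is a parameter so
// it can be swapped out (for example in tests); by default it is looked up
// on the global object, which only works in browsers that support the API.
function startDictation(
  onFinalText: (text: string) => void,
  lang = "en-US",
  Ctor: RecognitionCtor | undefined =
    (globalThis as any).SpeechRecognition ??
    (globalThis as any).webkitSpeechRecognition,
): Recognition {
  if (!Ctor) {
    throw new Error("Speech recognition is not supported in this browser.");
  }
  const recognition = new Ctor();
  recognition.continuous = true;     // keep listening after each phrase
  recognition.interimResults = true; // partial results arrive while speaking
  recognition.lang = lang;
  recognition.onresult = (event) => {
    // Only results from resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        onFinalText(result[0].transcript);
      }
    }
  };
  recognition.start();
  return recognition;
}
```

You would call it with something like `const rec = startDictation(text => console.log(text));` and later `rec.stop();`. Interim results are ignored in this sketch; if you want live captions, you could forward the non-final results to a second callback.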

Rohit Kharche
  • Thanks for the info, I was unaware of webkitSpeechRecognition. It is a good plan B, and on top of that it is free. – mctuna Aug 07 '23 at 08:43
  • @mctuna Yes, for your plan A to work you could explore [this implementation](https://stackoverflow.com/questions/43870277/how-to-stream-audio-from-angularjs-to-google-cloud-speech-api) as well. Glad I could help. – Rohit Kharche Aug 07 '23 at 09:47
  • @mctuna Do you require any additional help? – Rohit Kharche Aug 11 '23 at 06:03
  • Actually, it doesn't show how to use the Speech-to-Text service with streaming data on Firebase Functions, or am I missing something? – mctuna Aug 21 '23 at 09:01