11

Twilio can provide call recording, but that's not real-time. Is it possible to write an app that processes the caller's audio in real-time and responds after processing the audio? I'd like to have some software "listen" to the speaker and respond programmatically.

Zach Rattner

3 Answers

10

Two years later, Twilio has released a feature that covers the use case I was trying to build on my own. Programmable Voice now has a real-time speech recognition service built in. It's in public beta: https://www.twilio.com/blog/2017/05/introducing-speech-recognition.html
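
For anyone wondering what that looks like in practice, here is a minimal sketch, assuming the speech recognition is driven through `<Gather input="speech">` and that Twilio POSTs the transcript to the `action` URL in a `SpeechResult` field. The function names and the example URL are made up for illustration, and the TwiML is built with the standard library rather than Twilio's helper library, so treat it as a starting point rather than the official way to do it.

```python
import xml.etree.ElementTree as ET


def build_speech_gather(action_url: str, prompt: str) -> str:
    """Return TwiML that prompts the caller and collects spoken input.

    action_url is a hypothetical endpoint on your own server.
    """
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": action_url, "speechTimeout": "auto"},
    )
    ET.SubElement(gather, "Say").text = prompt
    return ET.tostring(response, encoding="unicode")


def reply_to_speech(form: dict) -> str:
    """Build a TwiML reply from the form fields POSTed to the action URL.

    Assumes the transcript arrives in the SpeechResult field.
    """
    heard = form.get("SpeechResult", "")
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    say.text = f"You said: {heard}" if heard else "Sorry, I did not catch that."
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_speech_gather("https://example.com/handle-speech", "How can I help?"))
    print(reply_to_speech({"SpeechResult": "check my balance"}))
```

You would return the first document from your voice webhook and the second from whatever handler sits behind the `action` URL. Note that this is turn-based: the caller speaks, Twilio transcribes, and then your code responds.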

Zach Rattner
  • Hey Zach, can you help me with [this](https://stackoverflow.com/questions/54157444/real-time-transcription-twilio-agent-conference) – absin Jan 12 '19 at 07:08
9

As far as I know, Twilio doesn't offer a way to process spoken audio as IVR input. It does accept keypad digit input, but that isn't as intelligent as what you're after: https://www.twilio.com/docs/api/twiml/gather.

You can, however, listen to a call that is currently in progress, with a catch: it has to be set up as a conference. A conference can do anything a normal dial can do. You can turn off some of the additional features and then use the Twilio JS library to discreetly join the conference and listen in on the call. If you were very ambitious, you could feed that audio into some speech-to-text software through the Twilio client and do all kinds of things with it.

See annyang for some speech-to-text interactivity: https://www.talater.com/annyang/

Chris Jenkins
1

For people still looking: Twilio now has Media Streams, which covers this use case. It's driven by the `<Stream>` TwiML instruction, which sends the call audio over a WebSocket to your server.
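
To make that a bit more concrete, here is a rough sketch of both halves, assuming the stream is started with `<Start><Stream url="..."/></Start>` and that each WebSocket message is a JSON object with an `event` field, where `media` events carry base64-encoded audio in `media.payload`. The function names and the `wss://` URL are hypothetical, and the WebSocket server itself is left out; you would call the handler from whatever WebSocket library you use.

```python
import base64
import json
import xml.etree.ElementTree as ET


def build_stream_twiml(ws_url: str) -> str:
    """Return TwiML that asks Twilio to stream call audio to ws_url.

    ws_url is a hypothetical WebSocket endpoint on your own server.
    """
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": ws_url})
    return ET.tostring(response, encoding="unicode")


def handle_stream_message(raw: str, audio: bytearray) -> str:
    """Process one text frame received on the WebSocket.

    Appends decoded audio bytes to `audio` for "media" events and
    returns the event name so the caller can react to start/stop.
    """
    message = json.loads(raw)
    event = message.get("event", "")
    if event == "media":
        # Assumption: the payload is base64-encoded raw audio.
        audio.extend(base64.b64decode(message["media"]["payload"]))
    return event


if __name__ == "__main__":
    print(build_stream_twiml("wss://example.com/audio"))
    buffer = bytearray()
    frame = json.dumps(
        {"event": "media", "media": {"payload": base64.b64encode(b"\xff\x7f").decode()}}
    )
    print(handle_stream_message(frame, buffer), len(buffer))
```

From there you can hand the accumulated bytes to a speech-to-text engine of your choice and respond on the call once you have a result.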

nSimonFR