Can an Alexa custom skill get access to the voice stream/audio file of a user?

Question

I would like to build a custom skill, but it would need direct access to the user's voice (or the output as recorded audio). Can/will Alexa relay the stream rather than sending the request invocations (launch/intent/session-end)?

I understand custom skills can send back MP3s as responses, but being able to gain access to the actual voice requests, either the stream or an MP3, would be awesome.

Edit:

It seems that no MP3 is provided in the request object: https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-interface-reference#LaunchRequest

score 15 · Answer 1 · answered Jun 01 '16 at 20:57


Alexa does not provide this service.

An always-on device in a domestic setting that can hear everything said, plus background noise and side conversations, is a huge security concern. Amazon mitigates this concern by filtering the input, performing the difficult speech-to-text work, and only providing the resulting text (after further processing by your interaction model).
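To illustrate what "only providing the resulting text" means in practice, here is a minimal sketch of the kind of JSON envelope a custom skill receives for an intent, and how a skill might read it. The intent name `OrderCoffeeIntent`, the slot `Size`, the IDs, and the helper `summarize_request` are all made up for illustration; only the general envelope shape (`version`, `session`, `request` with `type`, `intent`, `slots`) follows the interface reference. Note that nothing in it carries audio.

```python
import json

# Hypothetical request envelope. The intent name, slot name, and IDs are
# invented for this example; only the overall shape follows the docs.
SAMPLE_REQUEST = json.loads("""
{
  "version": "1.0",
  "session": {"new": false, "sessionId": "example-session-id"},
  "request": {
    "type": "IntentRequest",
    "requestId": "example-request-id",
    "timestamp": "2016-06-01T20:57:00Z",
    "intent": {
      "name": "OrderCoffeeIntent",
      "slots": {
        "Size": {"name": "Size", "value": "large"}
      }
    }
  }
}
""")


def summarize_request(envelope):
    """Return the text-level information a skill can read from a request."""
    request = envelope["request"]
    if request["type"] != "IntentRequest":
        # LaunchRequest and SessionEndedRequest carry no intent or slots.
        return {"type": request["type"], "intent": None, "slots": {}}
    intent = request["intent"]
    # Each slot arrives as already-recognized text, not as audio.
    slots = {name: slot.get("value") for name, slot in intent.get("slots", {}).items()}
    return {"type": "IntentRequest", "intent": intent["name"], "slots": slots}


summary = summarize_request(SAMPLE_REQUEST)
print(summary)
```

Everything the skill sees is the resolved intent plus slot values as strings, which is why there is no stream or file to grab on the skill side.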


Joseph Jaquinta


2

Surely, but my question wasn't about gaining access to the constant stream. Only when Alexa initiates and forwards a stream/audio to me once a user activates my skill. – Michael Ramos Jun 01 '16 at 21:14
2

Same privacy concerns. Same answer. – Joseph Jaquinta Jun 02 '16 at 11:46
1

Disagreed, and your answer is opinionated without much technical insight. – Michael Ramos Jun 02 '16 at 12:53
4

Actually, what Joseph told you is fact. Just because you wish it wasn't so doesn't make it opinionated. Michael-R - Joseph is one of the most prolific helpers on the Amazon Alexa forums, has published complicated skills, and written a book on Alexa programming. Please think twice before dismissing his technical insights. – John Wheeler Jun 03 '16 at 01:31
1

In the video at the bottom of the page [here](https://www.pindrop.com/resources/video/video/alexa-steal-mans-bitcoin/) it seems like it is possible if you hook up the Alexa device to a custom server. Any idea how they are doing this? – Corey Cole Apr 15 '19 at 21:58
@CoreyCole - I agree - in the video it seems like it is possible. Have you found out more about it? – RaideR May 08 '20 at 09:07
1

@RaideR - that video's a mock-up of what "could" be done. For the security reasons cited above, Amazon is never going to allow 3rd parties access to raw audio. If these people got their venture capital, they would have to become internal co-developers with Amazon to get that level of access. It is not available through the public SDK. – Joseph Jaquinta May 08 '20 at 14:26

score -1 · Answer 2 · answered May 31 '16 at 21:31

In short, no. I can't find this stated specifically anywhere in the documentation, but I just created a Python library that encapsulates all the JSON structures, so I know you can't do this yet.

The only control over audio is 'output' through embedding links in SSML.

https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/handling-requests-sent-by-alexa#Including%20Pre-Recorded%20Audio%20in%20your%20Response
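As a rough sketch of the output side, a skill response can embed a pre-recorded clip by putting an `<audio>` tag inside SSML output speech. The helper name `build_audio_response` and the example URL below are hypothetical, and the clip itself would still need to meet Amazon's hosting and encoding requirements described in the linked page.

```python
from xml.sax.saxutils import quoteattr


def build_audio_response(mp3_url, end_session=True):
    """Build a skill response whose speech is a single embedded audio clip.

    Hypothetical helper for illustration; it only assembles the JSON body.
    """
    # quoteattr escapes the URL and wraps it in quotes for use as an attribute.
    ssml = "<speak><audio src={} /></speak>".format(quoteattr(mp3_url))
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": end_session,
        },
    }


# The URL is a placeholder, not a real hosted file.
response = build_audio_response("https://example.com/clip.mp3")
print(response["response"]["outputSpeech"]["ssml"])
```

This covers audio going out to the user only; there is no matching field for audio coming in.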
