
I am using the flask-ask framework in Python to develop an Alexa skill that should play only the audio stream of a YouTube video, without downloading it.

For that I use pafy.

The following function gets the audio URL from the YouTube video:

def get_audio(url):
    video = pafy.new(url)
    bestaudio = video.getbestaudio(preftype="m4a")
    playurl = bestaudio.url
    return playurl

This function should then play the audio:

@ask.intent('StoryIntent')
def story():
    speech = "I am starting to play the story"
    stream_url = get_audio("https://www.youtube.com/watch?v=ALdKl2HRSoI")
    return audio(speech).play(stream_url)

Unfortunately, it doesn't work with the URL I get from pafy. Alexa just plays nothing.

I printed the url which I get from pafy, and it looks like this:

Pafy: Youtube-Audio URL

If I use the following URL for stream_url, everything works:

Sample Audio-File on S3

What I also tried: I downloaded the audio file from the "Pafy: Youtube-Audio URL", uploaded it to S3, and used the S3 link, and everything worked fine. But I don't want to download the YouTube video in my approach.

  • I just realized that the Pafy: Youtube-Audio URL points to a video format. Only when I download the file with pafy do I get an audio format. Is there a way to get the URL of the audio file without using the Stream.download() function? –  Jul 09 '18 at 14:33

1 Answer


I know this question is quite old, but I tried something similar over the last few days and got quite frustrated, as it took me a long while to figure out the solution. So maybe I can spare someone some hassle.

Like the OP, I tried to create an Alexa skill that plays YouTube audio. This can be done with pafy, as pointed out in the question, by using bestaudio = video.getbestaudio(). Since we do not intend to download the audio, it is not necessary to specify the format. The resulting link will be in the .webm format but contains only the audio part, which you can verify by opening the link in a browser.

Here is a code example for this part:

import pafy

def get_audio(url):
    video = pafy.new(url)
    best = video.getbestaudio()
    return best.url

yt_url = "https://www.youtube.com/watch?v=iu7289s7l64"
audio_url = get_audio(yt_url)
print(audio_url)

This actually runs fine for me locally. However, when deploying the code to AWS Lambda it wouldn't work. Alexa would report a problem with the requested skill response, and CloudWatch didn't show any error or log output.

There are two things to take care of here:

  1. Ensure that the library is packaged and uploaded. If it is missing, however, an error does show up in the CloudWatch logs, and packaging is not difficult with the new ASK CLI 2.x once pafy and youtube-dl are added to requirements.txt. With an older version of the ASK CLI (1.x) I had problems, as requirements.txt was ignored.
  2. Increase the timeout of the Lambda function in the AWS Console. By default it is set to 3 s, which is far too short in this case. The calls for long videos seem to take up to 10 s; to be safe, I set it to 30 s. For some reason this is much longer than the call needs locally (if anyone has insights on this, I'd be curious). This is what got me stuck.
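
To see how close the call actually gets to the timeout, it can help to log its duration. The following is only a minimal sketch: it assumes the standard-library logging output ends up in CloudWatch (the usual behaviour for Python Lambdas), and timed_call is a hypothetical helper name, not part of pafy or the ASK SDK.

```python
import logging
import time

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def timed_call(func, *args, **kwargs):
    """Call func, log how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    # The log line makes slow URL lookups visible in the function's logs.
    logger.info("%s took %.2f s", getattr(func, "__name__", "call"), elapsed)
    return result, elapsed


# Hypothetical usage inside the handler:
#   url, seconds = timed_call(get_audio, yt_url)
```

If the logged duration regularly comes close to the configured timeout, the timeout should be raised further.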

As the OP also asks how to invoke the audio stream, here is my solution for that:

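import ask_sdk_core.utils as ask_utils
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_model.interfaces.audioplayer import (
    AudioItem, PlayBehavior, PlayDirective, Stream)


class StoryIntent(AbstractRequestHandler):

    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return ask_utils.is_intent_name("StartPlayingIntent")(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        yt_url = "https://www.youtube.com/watch?v=iu7289s7l64"
        # Resolve the audio-only stream URL (get_audio is defined above)
        url = get_audio(yt_url)
        # Speak the intro, then hand the stream over to the AudioPlayer
        return handler_input.response_builder.speak('I am starting to play the story').add_directive(
            PlayDirective(
                play_behavior=PlayBehavior.REPLACE_ALL,
                audio_item=AudioItem(
                    stream=Stream(
                        token=url,
                        url=url,
                        expected_previous_token=None),
                    metadata=None
                )
            )
        ).set_should_end_session(True).response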

I'm not sure whether the code above would have worked at the time the question was asked. Class-based intent handlers are common today, so here we go.
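
One caveat with using the URL itself as the token: as far as I know, the AudioPlayer interface limits the token to 1024 characters, and the stream URLs returned by pafy can be quite long. A minimal sketch of one way around this, assuming a hash of the URL is acceptable as an opaque token; make_token is a hypothetical helper name and not part of the ASK SDK:

```python
import hashlib

# Token limit as I understand it from the AudioPlayer documentation (assumption).
MAX_TOKEN_LENGTH = 1024


def make_token(stream_url):
    """Return a short, stable token derived from the stream URL."""
    # A SHA-256 hex digest is always 64 characters, well below the limit.
    return hashlib.sha256(stream_url.encode("utf-8")).hexdigest()
```

In the handler, this would mean passing token=make_token(url) instead of token=url, while url=url stays unchanged.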

tl;dr: Increase the timeout of the Lambda function; the default is far too short.
