I know this question is quite old, but I tried something similar over the last few days and got quite frustrated, as it took me a long while to figure out the solution. So maybe I can spare someone some hassle.
Like the OP, I tried to create an Alexa skill that plays YouTube audio. This can be done with pafy and, as pointed out in the question, by calling bestaudio = video.getbestaudio(). As we do not intend to download the audio, it is not necessary to specify the format. The resulting link will be in the .webm format, but with only the audio part, which can be tested by opening the link in a browser.
Here is a code example for this part:
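import pafy

def get_audio(url):
    # Look up the video and pick the best available audio-only stream
    video = pafy.new(url)
    best = video.getbestaudio()
    # Return the direct stream URL instead of downloading the file
    return best.url

yt_url = "https://www.youtube.com/watch?v=iu7289s7l64"
audio_url = get_audio(yt_url)
print(audio_url)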
This runs fine for me locally. However, after deploying the code to AWS Lambda it did not work: Alexa reported a problem with the requested skill response, and CloudWatch showed no error or log output.
There are two things to take care of here:
- Ensure that the library is packaged and uploaded. A missing library, however, does throw an error in the CloudWatch logs, and packaging is not difficult with the new ASK CLI 2.x once pafy and youtube-dl are added to requirements.txt. With an older version of the ASK CLI (1.x) I had problems with that, as requirements.txt was ignored.
- Increase the timeout of the Lambda function in the AWS Console. By default it is set to 3 s, which is way too short in this case. The calls for long videos seem to take up to 10 s; to be safe I set it to 30 s. For some reason this is much longer than the call needs locally (if anyone has insights on this, I'd be curious). This is what got me stuck.
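To see how close the call gets to the timeout, it can help to log its duration. The following is only a minimal sketch, not something from my original skill: the helper name timed_call is made up for illustration, and it assumes the Lambda context object (which provides get_remaining_time_in_millis()) is passed in, for example via handler_input.context.

```python
import logging
import time

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def timed_call(func, *args, context=None):
    """Run func(*args), log the duration, and return (result, seconds).

    timed_call is a hypothetical helper. `context` is assumed to be the
    Lambda context object; its get_remaining_time_in_millis() method
    reports the time left before the configured timeout is reached.
    """
    name = getattr(func, "__name__", repr(func))
    # Log before the call so the log shows a start entry even if the
    # function is cut off before it finishes.
    logger.info("starting %s", name)
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    if context is not None:
        remaining_ms = context.get_remaining_time_in_millis()
        logger.info("%s took %.2f s, %d ms left before timeout",
                    name, elapsed, remaining_ms)
    else:
        logger.info("%s took %.2f s", name, elapsed)
    return result, elapsed
```

Inside a handler this could be used as url, seconds = timed_call(get_audio, yt_url, context=handler_input.context). A "starting" entry without a matching "took" entry would then hint that the function ran out of time.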
As the OP also asks how to invoke the audio stream, here is my solution for that:
from ask_sdk_core.dispatch_components import AbstractRequestHandler
import ask_sdk_core.utils as ask_utils
from ask_sdk_model.interfaces.audioplayer import (
    PlayDirective, PlayBehavior, AudioItem, Stream)

class StoryIntent(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return ask_utils.is_intent_name("StartPlayingIntent")(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        yt_url = "https://www.youtube.com/watch?v=iu7289s7l64"
        url = get_audio(yt_url)
        # Replace the current queue with the audio stream and end the session
        return handler_input.response_builder.speak('I am starting to play the story').add_directive(
            PlayDirective(
                play_behavior=PlayBehavior.REPLACE_ALL,
                audio_item=AudioItem(
                    stream=Stream(
                        token=url,
                        url=url,
                        expected_previous_token=None),
                    metadata=None
                )
            )
        ).set_should_end_session(True).response
I'm not sure whether the code above would have worked at the time this question was originally answered. Class-based intent handlers are common today, so that is what I used here.
tl;dr: Increase the timeout of the Lambda function; the default is way too short.