I need to extract YouTube links, together with their names, from YouTube playlists.
I tried using SelectorGadget (a Chrome extension) to find the CSS selector, but when I try to get anything about the link, BeautifulSoup returns None. I don't know where I'm going wrong.
Below is the code I wrote:
import sys
import re
import requests
from bs4 import BeautifulSoup
try:
    # check the URL format (raw string so the backslashes reach the regex engine unchanged)
    url_pattern = re.compile(r"^(?:http|https|ftp):\/\/[a-zA-Z0-9_~:\-\/?#[\]@!$&'()*+,;=`^.%]+\.[a-zA-Z0-9_~:\-\/?#[\]@!$&'()*+,;=`^.%]+$")
    # playlist_url = input("Enter your YouTube playlist url: ")
    # read the URL directly from the command line
    playlist_url = sys.argv[1]
    if not url_pattern.match(playlist_url):
        raise ValueError("Enter valid link")
    get_links_from_youtube_playlist(playlist_url)
except ValueError as value_error:
    print(value_error)
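For comparison, the URL check could also be done with the standard library instead of a regular expression. This is only a sketch: `extract_playlist_id` is a hypothetical helper name that is not part of my script, and it assumes the playlist id is carried in the `list` query parameter, as in the URL I use when running the script.

```python
from urllib.parse import urlparse, parse_qs


def extract_playlist_id(url):
    """Return the value of the `list` query parameter, or raise ValueError."""
    parsed = urlparse(url)
    # Require an http(s) scheme and a host before looking at the query string.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("Enter valid link")
    # parse_qs maps each parameter name to a list of values.
    playlist_ids = parse_qs(parsed.query).get("list")
    if not playlist_ids:
        raise ValueError("URL has no 'list' parameter")
    return playlist_ids[0]


print(extract_playlist_id(
    "https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab"
))
```

A check like this rejects input that is not an http(s) URL and also confirms that the link actually names a playlist, which the regular expression alone does not do.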
Then I pass the URL to another function:
def get_links_from_youtube_playlist(youtube_playlist_url):
    request_response = requests.get(youtube_playlist_url)
    # using "html.parser" lib
    # soup_object = BeautifulSoup(request_response.text, 'html.parser')
    # using "lxml" - Processing XML and HTML with Python
    soup_object = BeautifulSoup(request_response.text, 'lxml')
    # not working?!
    url_list = soup_object.select("#video-title")
    print(url_list)
    # this is not working too?!
    div_content = soup_object.find("div", attrs={"class": "content"})
    print(div_content)
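As a reference point for reading the output of the two calls above, here is a small sketch of what `select` and `find` return when nothing in the parsed markup matches. The `sample_html` string is a made-up stand-in, not the real response from YouTube, and it uses the built-in `html.parser` so that it runs without `lxml`.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a response body that lacks the target elements.
sample_html = "<html><body><p>placeholder</p></body></html>"
soup = BeautifulSoup(sample_html, "html.parser")

# select() always returns a list-like result, empty when nothing matches.
matches = soup.select("#video-title")
# find() returns a single Tag, or None when nothing matches.
div_content = soup.find("div", attrs={"class": "content"})

print(matches)      # []
print(div_content)  # None
```

So an empty list or `None` here only means that the elements are missing from the text that was parsed. It does not by itself say why they are missing.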
I run it with the command below:
python3 test.py https://www.youtube.com/playlist\?list\=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
My output is None when printing the result of either the select or the find method. Shouldn't it find something meaningful, since the id is present in the page?
SelectorGadget shows me #video-title only when I click on that section, and I could not even access the div.
How should I extract each link and its name?
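To make the goal concrete, here is a sketch of the extraction I am after, run against a hand-written HTML fragment. Everything in it is an assumption for illustration: the markup in `sample_html` is invented to resemble what SelectorGadget highlights, `extract_links` is a hypothetical helper name, and I have not confirmed that the text returned by `requests.get` contains elements like these.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on what SelectorGadget highlights.
sample_html = """
<div class="content">
  <a id="video-title" href="/watch?v=AAAAAAAAAAA" title="First video">
    First video
  </a>
  <a id="video-title" href="/watch?v=BBBBBBBBBBB" title="Second video">
    Second video
  </a>
</div>
"""


def extract_links(html, base_url="https://www.youtube.com"):
    """Return (name, absolute_url) pairs for every matching anchor."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # find_all returns every anchor carrying this id, even though ids repeat.
    for anchor in soup.find_all("a", id="video-title"):
        href = anchor.get("href")
        if not href:
            continue
        # Prefer the title attribute, fall back to the visible text.
        name = anchor.get("title") or anchor.get_text(strip=True)
        results.append((name, urljoin(base_url, href)))
    return results


for name, link in extract_links(sample_html):
    print(name, link)
```

If the same function returns an empty list for the real response, that would suggest the anchors are simply not in the downloaded text, which matches what I am seeing.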