I'm trying to parse YouTube with BeautifulSoup, but without luck. I've parsed many websites without any problems, but this one doesn't work and gives me this error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2117' in position 135588: character maps to <undefined>
I decoded it as follows:
page_soup = soup(page_html.decode("utf-8"), "html.parser")
x = page_soup.find('div',{'id':"dismissable"})
I still get the error above. But when I try this:
page_soup = soup(page_html, "html.parser").encode("utf-8")
Encoding it lets me print the webpage, but when I search in it as follows:
search_list = page_soup.find_all('div',{'class':"style-scope ytd-video-renderer"})
print(len(search_list))
I get the following error:
TypeError: slice indices must be integers or None or have an __index__ method
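While narrowing this down, I noticed that calling .encode("utf-8") on the soup seems to hand back plain bytes rather than a BeautifulSoup object, which may be why the search methods stop working. Below is a minimal sketch of that suspicion using only the standard library. The sample markup, the variable encoded_page and the helper bytes_find_error are made up for illustration and are not part of my real script.

```python
# Stand-in for the value returned by soup(...).encode("utf-8"): plain bytes.
# The sample markup is made up for illustration.
encoded_page = '<div id="dismissable">\u2117 2016</div>'.encode("utf-8")

print(type(encoded_page).__name__)        # bytes
print(hasattr(encoded_page, "find_all"))  # False: bytes has no find_all


def bytes_find_error(data):
    """Call bytes.find the way I called soup.find and return the error text."""
    try:
        # The second positional argument of bytes.find is a start index,
        # so passing the attribute dict there is rejected.
        data.find('div', {'id': "dismissable"})
    except TypeError as exc:
        return str(exc)
    return None


print(bytes_find_error(encoded_page))
```

If that is right, the TypeError comes from the built-in bytes.find and not from BeautifulSoup, so the search would have to run on the soup object before anything is encoded.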
Any advice would be welcome.
Many thanks.
Additionally, here is my full code:
import urllib3
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen
import requests
http = urllib3.PoolManager()
set_Link = set([''])
url = 'https://www.youtube.com/results?search_query=the+lumineers+sleep+on+the+floor'
r = http.request('get',url)
page_html = r.data  # store the HTML data in a variable
page_soup = soup(page_html, "html.parser").encode("utf-8")
print(page_soup)
search_list = page_soup.find_all('div',{'class':"style-scope ytd-video-renderer"})
print(len(search_list))
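For what it's worth, the UnicodeEncodeError at the top can be reproduced without YouTube or BeautifulSoup at all. This is a rough sketch that assumes my console uses cp1252 (a common Windows default, but I have not confirmed it). The sample text and the helper try_encode are hypothetical.

```python
# Made-up sample text containing U+2117, the character named in the error.
text = "\u2117 2016 The Lumineers"


def try_encode(s, encoding):
    """Return the error message if s cannot be encoded, else None."""
    try:
        s.encode(encoding)
        return None
    except UnicodeEncodeError as exc:
        return str(exc)


print(try_encode(text, "cp1252"))  # a 'charmap' error like the one above
print(try_encode(text, "utf-8"))   # None: UTF-8 can represent the character

# Replacing unencodable characters avoids the exception,
# at the cost of losing them in the printed output.
safe = text.encode("cp1252", errors="replace").decode("cp1252")
print(safe)
```

So the failure looks like it happens when print() writes to the console, not while parsing, which would explain why decoding the page as UTF-8 first did not help.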