Python newbie here. I'm currently writing a crawler for a lyrics website, and I'm running into a problem when trying to parse the HTML. I'm using BeautifulSoup and requests.
My code right now is:
import requests as r
from bs4 import BeautifulSoup as bs

def function(artist_name):
    temp = "https://www.lyrics.com/lyrics/"
    if ' ' in artist_name:
        artist_name = artist_name.replace(' ', '%20')
    page = r.get(temp + artist_name.lower()).content
    soup = bs(page, 'html.parser')
    return soup
When I try to test this out, I keep getting the following error:
UnicodeEncodeError: 'ascii' codec can't encode character '\xa0' in position 8767: ordinal not in range(128)
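For reference, the same kind of message can be reproduced without the network or BeautifulSoup. This is only a sketch, assuming the culprit is a non-breaking space (U+00A0) in the page text being encoded to ASCII somewhere, for example when the result is printed. The sample string is made up for illustration:

```python
# Minimal reproduction sketch: no network and no BeautifulSoup involved.
# Assumption: the page text contains a non-breaking space (U+00A0).
text = "lyrics\xa0here"  # hypothetical sample, not taken from the real page

try:
    text.encode("ascii")  # ASCII cannot represent U+00A0
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```

If that assumption holds, the parsing itself is probably fine and the failure happens only when the text is converted to ASCII on output.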
I've tried adding .encode('utf-8') to the end of the soup line, and it got rid of the error but wouldn't let me use any of the BeautifulSoup methods since it returns bytes.
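To illustrate the bytes problem, and one possible way around it, here is a rough sketch using only the standard library. It assumes the error comes from writing the text to an ASCII-only output stream; the string `text` is a hypothetical stand-in for something like str(soup), and the in-memory streams imitate a terminal:

```python
import io

# Hypothetical stand-in for str(soup) or soup.get_text(); contains U+00A0.
text = "lyrics\xa0here"

# Calling .encode() turns the str into bytes, so str/soup methods are gone.
encoded = text.encode("utf-8")
print(type(encoded))

# An ASCII-only text stream, imitating a terminal with an ASCII locale.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
try:
    ascii_stream.write(text)
    ascii_stream.flush()
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# The same write succeeds when the stream itself is UTF-8,
# and `text` stays a str the whole time.
raw = io.BytesIO()
utf8_stream = io.TextIOWrapper(raw, encoding="utf-8")
utf8_stream.write(text)
utf8_stream.flush()
written = raw.getvalue()

print(ascii_failed, written)
```

The idea is to leave the soup object untouched and change the encoding only at the point of output. On Python 3.7+, sys.stdout.reconfigure(encoding="utf-8") or the PYTHONIOENCODING environment variable might do the same for real printing, though I haven't confirmed that this is what is going wrong here.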
I've taken a look at the other posts on here and tried the solutions they provide for similar errors, without success. There's still a lot I have to understand about Python and Unicode, so if someone could help out and give some guidance, it would be much appreciated.