
Python newbie here. Currently writing a crawler for a lyrics website, and I'm running into this problem when trying to parse the HTML. I'm using BeautifulSoup and requests.

My code right now (imports included):

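import requests as r
from bs4 import BeautifulSoup as bs

def function(artist_name):
    # Base URL for lyrics pages
    temp = "https://www.lyrics.com/lyrics/"
    # Percent-encode spaces so the name can go into the URL
    artist_name = artist_name.replace(' ', '%20')
    # Fetch the page body as raw bytes
    page = r.get(temp + artist_name.lower()).content
    # Parse the HTML with Python's built-in parser
    soup = bs(page, 'html.parser')
    return soup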
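
A side note on the URL: replacing only spaces with `%20` leaves other characters, such as accented letters, unescaped. A minimal sketch using the standard library's `urllib.parse.quote` instead; `build_artist_url` and `BASE_URL` are hypothetical names, and the base URL is simply the one from the code above.

```python
from urllib.parse import quote

# Same base URL as in the question's code
BASE_URL = "https://www.lyrics.com/lyrics/"

def build_artist_url(artist_name):
    """Return the lyrics URL with the artist name percent-encoded."""
    # quote() escapes spaces as %20 and non-ASCII characters as UTF-8 bytes
    return BASE_URL + quote(artist_name.lower())

print(build_artist_url("Pink Floyd"))
```

This only changes how the URL is built; it is not expected to affect the encoding error.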

When I try to test this out, I keep getting the following error:

UnicodeEncodeError: 'ascii' codec can't encode character '\xa0' in position 8767: ordinal not in range(128)
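
Python raises this message whenever a string containing a non-ASCII character is encoded with the ASCII codec, for example when text is printed to a console whose encoding is ASCII. The traceback is not shown here, so treating that as the cause is an assumption. Under that assumption, the same error can be reproduced without any network access:

```python
# '\xa0' is a non-breaking space (&nbsp; in HTML), common in scraped pages
text = "lyrics\xa0here"

try:
    # Encoding to ASCII fails because '\xa0' is outside the 0-127 range
    text.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)

print(message)
```

If this matches, the failure happens where the text is output, not where the HTML is parsed.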

I've tried adding .encode('utf-8') to the end of the soup line. That got rid of the error, but it returns bytes, so I can't use any of the BeautifulSoup methods on the result.
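
Calling `.encode('utf-8')` on the soup produces `bytes`, which is why the BeautifulSoup methods are no longer available. A minimal sketch of one alternative: leave the parsed object alone and encode only the text that is actually written out. The names `write_text`, `page_text`, and `sink` are hypothetical, and the sketch assumes the error occurs at output time, which the question does not confirm.

```python
import io

def write_text(binary_stream, text, encoding="utf-8"):
    """Encode text at the output boundary only; return the number of bytes written."""
    data = text.encode(encoding)
    binary_stream.write(data)
    return len(data)

page_text = "lyrics\xa0here"  # stand-in for soup.get_text(); remains a str
sink = io.BytesIO()           # stand-in for sys.stdout.buffer or a file opened in "wb" mode
written = write_text(sink, page_text)
```

The design choice is to keep working with `str` and the soup object everywhere, converting to bytes only in the one place where data leaves the program.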

I've taken a look at the other posts on here and tried the solutions they provide for similar errors. There's still a lot I have to understand about Python and Unicode, so any help or guidance would be much appreciated.

Martin Gergov
pythonNoob
  • Welcome to the community. Could you give a little more info? I am unable to reproduce the error. What `artist_name` are you using to produce this error? What is the function feeding into? Thanks. – Mr. Kelsey Oct 11 '18 at 00:25
  • Welcome to the community! You may take a look at this https://stackoverflow.com/questions/31137552/unicodeencodeerror-ascii-codec-cant-encode-character-at-special-name/31137935#31137935 – Proteeti Prova Dec 27 '18 at 13:31
  • You've discovered a non-utf-8 character. There are different options to deal with this ... I've used this link: https://stackoverflow.com/questions/3224268/python-unicode-encode-error – Kyle Hurst Jan 28 '20 at 22:29

0 Answers