I have a script thats looping through a database and doing some beautifulsoup processing on the string along with replacing some text with other text, etc.
This works 100% most of the time, however some html blobs seems to contain unicode text which breaks the script with the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 112: ordinal not in range(128)
I'm not sure what to do in this case, does anyone know of a module / function to force all text in the string to be a standardized utf-8 or something?
All the html blobs in the database came from feedparser (downloading rss feeds, storing in db).