I am scraping text from a webpage in Python.
The text contains all kinds of special Unicode characters such as hearts, smileys and other wild stuff.
By using content.encode('ascii', 'ignore') I am able to convert everything to ASCII, but that means all accented characters and umlauts such as 'ä', as well as letters like 'ß', are gone too.
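To illustrate what I mean, here is a minimal sketch of the problem. It assumes Python 3, and the sample string and variable names are made up for this example:

```python
# Hypothetical sample text: German and French letters plus a heart symbol.
content = "Größe ♥ café"

# Encoding to ASCII with errors='ignore' silently drops every non-ASCII
# character, so 'ö', 'ß' and 'é' disappear together with the heart.
ascii_only = content.encode("ascii", "ignore").decode("ascii")

print(repr(ascii_only))
```

The heart is removed as intended, but the letters I want to keep are removed with it.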
How can I leave the "normal" characters such as 'ä' or 'é' intact but remove all the other stuff?
(I must admit I am quite a newbie in Python and I never really understood all the magic behind character encoding.)