I am scraping a page and have run into an issue where some words show up as '' and ''. I have searched all over the web but found no answer on how to decode them, so I am asking here: is there any way to decode them?

M_Sea
1 Answer
These are called "HTML entities". If you search for that name, you can find many ways to parse them in Python (see "Decode HTML entities in Python string?").
import html
print(html.unescape(''))
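The entity text from the question did not survive, so here is a minimal sketch that assumes the page contained hexadecimal numeric references such as `&#xe091;` and `&#xe3c4;`. That input is my guess, inferred from the code points mentioned in the P.S.; decimal references like `&#57489;` should decode the same way.

```python
import html

# Assumed input: the exact entity text from the question is not known.
raw = "&#xe091;&#xe3c4;"

# html.unescape turns named and numeric character references into characters.
decoded = html.unescape(raw)

# Print code points instead of the characters themselves,
# because Private Use Area glyphs usually do not render in a terminal.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)
```

Printing the code points rather than the decoded string makes it easier to see what you actually got back.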
P.S. The Unicode code points U+E091 and U+E3C4 are in the Private Use Area of Unicode, so they don't have any meaning unless someone defines one (e.g. a web font).
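If you want to check this for the characters you scraped, one possible sketch uses the standard `unicodedata` module, where private-use characters have the general category `Co`. The helper name `is_private_use` is just an illustration, not something from a library.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a private-use code point."""
    # "Co" is the Unicode general category for private-use characters.
    return unicodedata.category(ch) == "Co"


# The two code points from the answer, plus an ordinary letter for contrast.
for ch in ("\ue091", "\ue3c4", "A"):
    print(f"U+{ord(ch):04X}", is_private_use(ch))
```

When this reports a character as private use, unescaping alone cannot give you readable text; the meaning lives in whatever defines the glyph, such as the page's web font.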

John Kugelman

Coxxs