0

I am scraping a site and have run into a problem: some of the words show up as '' and ''. I searched all over the web but found no answer on how to decode them, so I came here to ask for help. Is there any way to decode them?

M_Sea

1 Answer

1

These are called "HTML entities". If you search for that name, you can find many ways to parse them in Python (see "Decode HTML entities in Python string?").

import html
print(html.unescape(''))

P.S. The Unicode code points U+E091 and U+E3C4 are in the Private Use Area of Unicode. They have no meaning unless someone defines one (e.g. a web font).
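
To make the point about the Private Use Area concrete, here is a minimal sketch. The input string is an assumption: the exact entities from the question are not preserved, so `&#xE091;` and `&#xE3C4;` are illustrative numeric character references built from the code points mentioned above, and `is_private_use` is a hypothetical helper name. It decodes the string with `html.unescape` and uses `unicodedata.category` to flag characters whose general category is "Co" (private use).

```python
import html
import unicodedata

# Assumed input: illustrative references built from the code points
# U+E091 and U+E3C4, plus two ordinary named entities for comparison.
raw = "&#xE091;&#xE3C4; &amp; &lt;b&gt;"

# html.unescape converts named and numeric character references to text.
decoded = html.unescape(raw)


def is_private_use(ch: str) -> bool:
    """Return True if ch has Unicode general category 'Co' (private use)."""
    return unicodedata.category(ch) == "Co"


for ch in decoded:
    label = "private use" if is_private_use(ch) else "standard"
    print(f"U+{ord(ch):04X} {label}")
```

Decoding succeeds either way, but a character flagged as private use will only display as intended with the font the site supplies, so the decoded text alone may not tell you which word or glyph was meant.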

John Kugelman
Coxxs