
I used Python to fetch an HTML page from a Japanese comic site, then used a regex to extract the chapter titles of the comics. Most of the titles come out correctly as they are, but some of them come in a different format.

Here is an example: 骸骨騎士様、只今異世界へお出掛け中_第19章

I tried to search for similar questions about this format, but when I type it into Google it is automatically converted to Japanese characters.

Sorry if this is an obvious question to some of you, but I have no idea how to convert this format using Python. Please help me convert it.

yashas123

1 Answer

import html

# Avoid naming the variable `str`, which would shadow the built-in type.
title = "骸骨騎士様、只今異世界へお出掛け中_第19章"
print(html.unescape(title))

See "Decode HTML entities in Python string?" for more details.
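As a rough illustration of what `html.unescape` does, here is a minimal sketch. It assumes the titles arrive from the page as numeric character references (`&#NNNNN;`), which is my guess at the format described in the question. The `escaped` string is built in the code purely for demonstration and is not copied from the actual site.

```python
import html

title = "骸骨騎士様、只今異世界へお出掛け中_第19章"

# Hypothetical input: rebuild the title as decimal character references,
# e.g. "&#39608;&#39592;...", to mimic what a scraped page might contain.
escaped = "".join(f"&#{ord(ch)};" for ch in title)

# html.unescape turns the references back into the original characters.
decoded = html.unescape(escaped)
print(decoded)

# Hexadecimal references and named entities are handled the same way.
hex_escaped = "".join(f"&#x{ord(ch):x};" for ch in title)
hex_decoded = html.unescape(hex_escaped)
named_decoded = html.unescape("Tom &amp; Jerry")
```

Text that contains no entities passes through `html.unescape` unchanged, so it should be safe to apply it to every title the regex extracts, not only the ones that look encoded.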

John Keyes