According to this older answer, Python 3 strings support Unicode by default. But in my web scraper, which uses BeautifulSoup, when I print or display a URL, the Japanese characters show up as '%E3%81%82' or '%E3%81%B3' instead of the actual characters.
This Japanese website is the one I'm collecting information from, more specifically the URLs behind the clickable letter buttons. When you hover over, for example, あa, your browser shows that the link you're about to click is `https://kokugo.jitenon.jp/cat/gojuon.php?word=あ`. However, when I extract the `["href"]` attribute of the link using BeautifulSoup, I get `https://kokugo.jitenon.jp/cat/gojuon.php?word=%E3%81%82`.
Both versions link to the same web page, but for the sake of debugging, I'm wondering if it's possible to make sure the displayed string contains the actual Japanese character. If not, how can I convert the string to accommodate this purpose?
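To make the difference concrete, here is a minimal sketch that leaves BeautifulSoup out and starts from the extracted string. It assumes the `%XX` sequences are percent-encoded UTF-8 bytes and that `urllib.parse.unquote` from the standard library is a suitable way to decode them for display. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

# The href value as extracted from the page (percent-encoded form).
href = "https://kokugo.jitenon.jp/cat/gojuon.php?word=%E3%81%82"

# unquote() replaces %XX escapes with the characters they encode,
# interpreting the bytes as UTF-8 by default.
readable = unquote(href)
print(readable)

# quote() goes the other way: it percent-encodes the UTF-8 bytes
# of a non-ASCII character.
encoded = quote("あ")
print(encoded)
```

If that assumption holds, the first `print` should show the URL with the actual character, and the second should reproduce the escaped form I'm getting from the `href`.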