How to decode this "%E3%83%9C" string in Python?

Question

So I have the following string

"%E3%83%9C%E3%83%89%E3%82%AB%E3%81%95%E3%82%93"

It actually means this

ボドカさん

This string seems to be encoded in UTF-8, because when I write this in Python:

encoded_str = b'\xe3\x83\x9c\xe3\x83\x89\xe3\x82\xab\xe3\x81\x95\xe3\x82\x93'
print(encoded_str)
print(encoded_str.decode('utf-8'))

Here is the output I get

b'\xe3\x83\x9c\xe3\x83\x89\xe3\x82\xab\xe3\x81\x95\xe3\x82\x93'
ボドカさん

Now I would like a script that can decode any string in the initial format. Here is my code:

import re
import os

mystr = "%E3%83%9C%E3%83%89%E3%82%AB%E3%81%95%E3%82%93"
mystr = mystr.lower()
mystr = re.sub('%', r'\\x', mystr)
encoded_str = bytes(mystr, "utf-8")

print(mystr)
print(encoded_str)
print(encoded_str.decode('utf-8'))

Output:

\xe3\x83\x9c\xe3\x83\x89\xe3\x82\xab\xe3\x81\x95\xe3\x82\x93
b'\\xe3\\x83\\x9c\\xe3\\x83\\x89\\xe3\\x82\\xab\\xe3\\x81\\x95\\xe3\\x82\\x93'
\xe3\x83\x9c\xe3\x83\x89\xe3\x82\xab\xe3\x81\x95\xe3\x82\x93

I tried many approaches, but I couldn't find the right way to encode my string properly, the way the b'STRING' literal does. I always get extra \ characters from the encoding step, which then break the decoding step too.

I tried every encoding that Python's bytes() function accepts.

I need help, please. Thank you. Stack Overflow banned me for that question lol

The %-style encoding is known as URL encoding (percent-encoding). The linked duplicate shows how to decode it. – snakecharmerb, Sep 07 '22 at 13:21
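For reference, here is a minimal sketch of decoding URL-encoded text with the standard library. It assumes the input is percent-encoded UTF-8, as in the question; the variable names are only illustrative.

```python
from urllib.parse import unquote, unquote_to_bytes

mystr = "%E3%83%9C%E3%83%89%E3%82%AB%E3%81%95%E3%82%93"

# unquote() turns each %XX escape into a byte and decodes the
# result as text (UTF-8 is the default encoding).
decoded = unquote(mystr)
print(decoded)

# unquote_to_bytes() stops one step earlier and returns the raw bytes,
# which is the b'...' value the question was trying to build by hand.
raw = unquote_to_bytes(mystr)
print(raw)
```

unquote() also leaves any characters that are not percent-encoded untouched, so it should work on strings that mix plain ASCII and escapes.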

score 1 · Answer 1 · answered Sep 07 '22 at 13:21

mystr = "%E3%83%9C%E3%83%89%E3%82%AB%E3%81%95%E3%82%93"
# Strip the '%' signs to leave a plain hex string, then convert it to raw bytes
encoded_str = bytes.fromhex(mystr.replace('%', ''))
print(encoded_str.decode('utf-8'))

Output:

ボドカさん
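The snippet above relies on every character in the input being percent-encoded. If the string may also contain plain characters, one possible extension is to apply the same bytes.fromhex idea only to runs of %XX escapes. This is a sketch under that assumption; decode_percent_runs and the sample input are hypothetical names made up for illustration.

```python
import re

def decode_percent_runs(text: str) -> str:
    """Decode each run of %XX escapes as UTF-8 and leave other text as-is."""
    def repl(match: re.Match) -> str:
        # A whole run is decoded at once so multi-byte UTF-8
        # sequences are not split between escapes.
        hex_digits = match.group(0).replace('%', '')
        return bytes.fromhex(hex_digits).decode('utf-8')

    return re.sub(r'(?:%[0-9A-Fa-f]{2})+', repl, text)

mixed = "name=%E3%83%9C%E3%83%89&x=1"  # hypothetical mixed input
result = decode_percent_runs(mixed)
print(result)
```

A run that is not valid UTF-8 raises UnicodeDecodeError here, which may or may not be the behaviour you want.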

Crapicus

Thank you so much, I didn't know about the format expected to decode UTF-8 strings. – Mash Sep 07 '22 at 13:23
