How do I decode percent-encoded characters to ordinary unicode characters?
"Lech_Kaczy%C5%84ski" ⟶ "Lech_Kaczyński"
For Python 3, use urllib.parse.unquote:
from urllib.parse import unquote
print(unquote("Lech_Kaczy%C5%84ski"))
Output:
Lech_Kaczyński
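As a small sketch of how the decoding behaves beyond the default case: unquote takes optional encoding and errors parameters (defaulting to 'utf-8' and 'replace'), and quote performs the reverse operation. The "caf%E9" input below is a made-up example, not from the question, used only to show what happens when the escapes are not UTF-8.

```python
from urllib.parse import quote, unquote

encoded = "Lech_Kaczy%C5%84ski"

# By default the percent-escapes are interpreted as UTF-8 bytes.
decoded = unquote(encoded)

# quote() reverses the operation, so the string round-trips.
reencoded = quote(decoded)

# Hypothetical input whose escapes are Latin-1, not UTF-8:
# pass the encoding explicitly to get the intended character.
latin1 = unquote("caf%E9", encoding="latin-1")

# With the default UTF-8 decoding, the invalid byte is replaced
# by U+FFFD instead of raising an error (errors='replace').
replaced = unquote("caf%E9")

print(decoded)
print(reencoded)
print(latin1)
```

If the source of the URL is unknown, the default errors='replace' means a wrong encoding shows up as replacement characters rather than an exception; passing errors='strict' makes such cases raise UnicodeDecodeError instead.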
For Python 2, use urllib.unquote:
import urllib
urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8')
This will return a unicode string:
u'Lech_Kaczy\u0144ski'
which you can then print and process as usual. For example:
print(urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8'))
will result in
Lech_Kaczyński
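This worked for me in Python 2 (unquote returns a UTF-8 byte string here, so it displays correctly only when the terminal expects UTF-8):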
import urllib
print urllib.unquote('Lech_Kaczy%C5%84ski')
Prints out
Lech_Kaczyński
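If the same code has to run on both Python 2 and Python 3, one possible approach is to import unquote from whichever module exists and normalize the result to a text string. This is only a sketch: decode_percent is a hypothetical helper name, and it assumes the input is a native str containing UTF-8 percent-escapes.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def decode_percent(text):
    """Hypothetical helper: decode percent-escapes to a text string.

    Assumes `text` is a native str with UTF-8 percent-escapes.
    """
    result = unquote(text)
    # On Python 2, unquote returns a byte string for str input,
    # so it still has to be decoded; on Python 3 it is already text.
    if isinstance(result, bytes):
        result = result.decode("utf-8")
    return result


print(decode_percent("Lech_Kaczy%C5%84ski"))
```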