How do I encode/decode percent-encoded (URL) strings in Python?

Question

How do I decode percent-encoded characters to ordinary Unicode characters?

"Lech_Kaczy%C5%84ski"    ⟶    "Lech_Kaczyński"

Possible duplicate of [How to unquote a urlencoded unicode string in python?](http://stackoverflow.com/questions/300445/how-to-unquote-a-urlencoded-unicode-string-in-python) — Peter Wood, Oct 15 '15 at 08:36

Mateen Ulhaq · Answer 1 · 2022-03-30T20:38:45.357

27

from urllib.parse import unquote

print(unquote("Lech_Kaczy%C5%84ski"))

Output:

Lech_Kaczyński
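
The title also asks about encoding. A minimal sketch, assuming Python 3 and the standard-library `urllib.parse.quote` as the counterpart of `unquote` (both use UTF-8 by default):

```python
from urllib.parse import quote, unquote

original = "Lech_Kaczyński"

# quote() percent-encodes non-ASCII characters as their UTF-8 bytes.
encoded = quote(original)
print(encoded)  # Lech_Kaczy%C5%84ski

# unquote() reverses this and decodes the bytes as UTF-8 by default.
decoded = unquote(encoded)
print(decoded)  # Lech_Kaczyński
```

Both functions accept an `encoding` argument if the data is not UTF-8.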

edited Mar 30 '22 at 20:38

answered Jan 12 '20 at 06:44

Mateen Ulhaq

import error, should be `from urllib.parse import unquote` for imports to work properly – Mahmoud Elshahat Oct 02 '21 at 05:49

score 14 · Accepted Answer · edited Mar 30 '22 at 20:36

For Python 2, using urllib.unquote:

import urllib
urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8')

This will return a unicode string:

u'Lech_Kaczy\u0144ski'

which you can then print and process as usual. For example:

print(urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8'))

will result in

Lech_Kaczyński

edited Mar 30 '22 at 20:36

Mateen Ulhaq

answered Oct 15 '15 at 08:34

It gives me `Lech_Kaczy\xc5\x84ski`, instead of `Lech_Kaczyński` – yak Oct 15 '15 at 08:36
That doesn't look like a unicode string, are you sure you tried correctly? Here's my session: ... (I'll edit it in the post) – Matthias C. M. Troffaes Oct 15 '15 at 08:38
I'm not sure you even need the `decode` call (based only on it working when I try without). – Holloway Oct 15 '15 at 08:41
Make sure you put the decode('utf8') at the very end. I can only reproduce what you get if I do the decoding in the wrong place. – Matthias C. M. Troffaes Oct 15 '15 at 08:43
Trengot: technically it is not necessary. However, in Python it is generally recommended to convert all your text to unicode as soon as possible, so you don't need to worry about encodings when you pass this to other functions. – Matthias C. M. Troffaes Oct 15 '15 at 08:45
@yak, you must use a display method that's compatible with UTF-8. If your Python is expecting an ASCII display, it will not attempt to display non-ASCII symbols. – Jasen Mar 08 '21 at 23:21
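
The `\xc5\x84` in the first comment is the raw UTF-8 byte sequence for `ń`, which is what you see before decoding. A rough sketch of the two steps in Python 3, assuming `urllib.parse.unquote_to_bytes` as a stand-in for Python 2's byte-returning `urllib.unquote`:

```python
from urllib.parse import unquote_to_bytes

# Step 1: undo the percent-encoding only; the result is still raw bytes.
raw = unquote_to_bytes("Lech_Kaczy%C5%84ski")
print(raw)  # b'Lech_Kaczy\xc5\x84ski'

# Step 2: decode the UTF-8 bytes into text.
text = raw.decode("utf-8")
print(text)  # Lech_Kaczyński
```

Skipping step 2, or decoding before unquoting, leaves the two bytes `\xc5\x84` where `ń` should be.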

score 1 · Answer 3 · answered Dec 10 '19 at 04:30

This worked for me in Python 2:

import urllib

print urllib.unquote('Lech_Kaczy%C5%84ski')

Prints out

Lech_Kaczyński

answerzilla
