How can I percent-encode URL parameters in Python?

Question

If I do

url = "http://example.com?p=" + urllib.quote(query)

It doesn't encode / to %2F (breaks OAuth normalization)
It doesn't handle Unicode (it throws an exception)

Is there a better library?

What is the language-agnostic canonical Stack Overflow question? (That is, only covering the encoding, not *how* it is achieved.) — Peter Mortensen, Nov 27 '22 at 21:54

score 526 · Accepted Answer · edited Jul 04 '23 at 12:37

From the Python 3 documentation:

urllib.parse.quote(string, safe='/', encoding=None, errors=None)

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-~' are never quoted. By default, this function is intended for quoting the path section of a URL. The optional safe parameter specifies additional ASCII characters that should not be quoted — its default value is '/'.

That means passing '' for safe will solve your first issue:

>>> import urllib.parse
>>> urllib.parse.quote('/test')
'/test'
>>> urllib.parse.quote('/test', safe='')
'%2Ftest'

(The function quote was moved from urllib to urllib.parse in Python 3.)
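
For the OAuth case in the question, a minimal sketch could look like the following. The helper name `percent_encode` and the sample query are made up for illustration. Passing `safe='~'` is an assumption that you want only the RFC 3986 unreserved characters left unescaped; `quote` already leaves `~` alone on Python 3.7 and later, so there the argument is redundant but harmless.

```python
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """Hypothetical helper: escape everything except letters, digits and '-._~'."""
    # An empty safe set would also escape '/', which is the point here.
    # '~' is listed explicitly because quote() escaped it before Python 3.7.
    return quote(value, safe="~")


query = "a/b c&d=Müller~"  # sample value, not from the question
url = "http://example.com?p=" + percent_encode(query)
print(url)  # http://example.com?p=a%2Fb%20c%26d%3DM%C3%BCller~
```

The slash, space, ampersand, equals sign and the non-ASCII character are all escaped, and the string is encoded as UTF-8 without raising.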

By the way, have a look at urlencode.
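
As a rough illustration of what `urlencode` does with the same kind of input: by default it escapes each key and value with `quote_plus`, so spaces become `+`. On Python 3.5 and later it accepts a `quote_via` argument, which lets you switch to `quote` if you need `%20` instead. The parameter names and values below are sample data.

```python
from urllib.parse import quote, urlencode

params = {"p": "a b/c", "val": 34}  # sample data

# Default: quote_plus is applied to every key and value.
default_encoded = urlencode(params)
print(default_encoded)  # p=a+b%2Fc&val=34

# quote_via=quote keeps the key=value&... structure but encodes spaces as %20.
strict_encoded = urlencode(params, quote_via=quote)
print(strict_encoded)  # p=a%20b%2Fc&val=34
```

In both cases the slash is escaped, because `urlencode` passes an empty `safe` string to the quoting function unless told otherwise.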

About the second issue, there was a bug report about it and it was fixed in Python 3.

For Python 2, you can work around it by encoding as UTF-8 like this:

>>> import urllib
>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller
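
For comparison, here is a sketch of the same round trip on Python 3, where no manual encode/decode step should be needed because `quote` and `unquote` use UTF-8 by default:

```python
from urllib.parse import quote, unquote

encoded = quote("Müller")   # str in, str out; UTF-8 is the default encoding
decoded = unquote(encoded)  # reverses the escaping, again assuming UTF-8

print(encoded)  # M%C3%BCller
print(decoded)  # Müller
```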

edited Jul 04 '23 at 12:37 by Benjamin Loison
answered Nov 08 '09 at 02:52 by Nadia Alramli

score 2 · Thank you, both worked great. urlencode just calls quote_plus many times in a loop, which isn't the correct normalization for my task (OAuth). – Paul Tarjan Nov 08 '09 at 09:14

score 8 · The spec, [rfc 2396](https://www.ietf.org/rfc/rfc2396.txt), defines these as reserved: `reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","`, which is what urllib.quote is dealing with. – Jeff Sheffield Sep 23 '15 at 17:42

score 8 · `urllib.parse.quote` [docs](https://docs.python.org/3/library/urllib.parse.html#url-quoting) – Andreas Haferburg Dec 16 '16 at 10:50

Also, when encoding a search query, you may be better off using quote_plus: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus 1. It encodes slashes by default. 2. It also encodes spaces. – Pavel Vergeev May 30 '18 at 09:50

`six.moves.urllib.parse.quote(u"Müller".encode('utf8'))` for Python 2 and 3. – Bob Stein Dec 10 '18 at 21:00

score 2 · If you want to retain the colon from http:, do `urllib.parse.quote('http://example.com/some path/').replace('%3A', ':')` – chrizonline May 09 '19 at 07:27

score 3 · @chrizonline Just use `urllib.parse.quote(url, safe=':/')`. Even better, encode `some path`, then join strings. This is Python, not PHP. – Pavel Vlasov Dec 23 '21 at 09:26

safe="" is missing in Python 3 answer! – PythoNic Jun 04 '22 at 14:48
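
To make the differences raised in these comments concrete, here is a small sketch comparing `quote`, `quote` with an empty `safe`, and `quote_plus`, followed by the two ways of handling a full URL that were suggested. The sample strings are only illustrative.

```python
from urllib.parse import quote, quote_plus

value = "some path/with spaces"  # sample value

print(quote(value))           # some%20path/with%20spaces    ('/' kept by default)
print(quote(value, safe=""))  # some%20path%2Fwith%20spaces  ('/' escaped)
print(quote_plus(value))      # some+path%2Fwith+spaces      (space -> '+', '/' escaped)

# Option 1: quote the whole URL but mark ':' and '/' as safe.
whole = quote("http://example.com/some path/", safe=":/")

# Option 2: encode only the variable segment, then join the pieces.
joined = "http://example.com/" + quote("some path", safe="") + "/"

print(whole)   # http://example.com/some%20path/
print(joined)  # http://example.com/some%20path/
```

Option 2 is usually the more robust of the two, since a slash inside the variable segment is then escaped rather than being mistaken for a path separator.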

score 209 · Answer 2 · edited Nov 19 '21 at 15:48

In Python 3, urllib.quote has been moved to urllib.parse.quote, and it does handle Unicode by default.

>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'
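
"Handles Unicode by default" here means the string is encoded as UTF-8 before escaping. If the receiving server expects a different charset, the `encoding` parameter can be set explicitly. A short sketch, using Latin-1 purely as an example:

```python
from urllib.parse import quote, unquote

path = "/El Niño/"

utf8_quoted = quote(path)                       # default encoding is UTF-8
latin1_quoted = quote(path, encoding="latin-1")  # ñ is the single byte 0xF1

print(utf8_quoted)    # /El%20Ni%C3%B1o/
print(latin1_quoted)  # /El%20Ni%F1o/

# Decoding has to use the same encoding to get the original text back.
round_trip = unquote(latin1_quoted, encoding="latin-1")
print(round_trip)     # /El Niño/
```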

edited Nov 19 '21 at 15:48 by Peter Mortensen
answered Nov 29 '12 at 11:52 by Paolo Moretti

score 2 · The name `quote` is rather vague as a global. It might be nicer to use something like urlencode: `from urllib.parse import quote as urlencode`. – Luc Mar 05 '19 at 16:35

score 5 · Note that there is a function named `urlencode` in `urllib.parse` already that does something completely different, so you'd be better off picking another name or risk seriously confusing future readers of your code. – jaymmer - Reinstate Monica Apr 02 '20 at 02:41

score 1 · (Style suggestion: @Luc I agree that `quote` is "rather vague". Rather than renaming it, you can leave the name fully qualified as `urllib.parse.quote`. That takes a little extra time typing and saves time reading and maintaining the code.) – Trevor Boyd Smith Jan 24 '23 at 14:07

score 63 · Answer 3 · edited Nov 19 '21 at 15:59

I think the requests module is much better. It's based on urllib3.

You can try this:

>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'

(My answer is similar to Paolo's answer.)

edited Nov 19 '21 at 15:59 by Peter Mortensen
answered Jul 14 '15 at 08:30 by Aminah Nuraini

score 8 · `requests.utils.quote` is a link to Python's `quote`. See [request sources](https://github.com/kennethreitz/requests/blob/master/requests/compat.py#L36). – Cjkjvfnby Aug 05 '15 at 14:11

score 25 · `requests.utils.quote` is a thin compatibility wrapper to `urllib.quote` for Python 2 and `urllib.parse.quote` for Python 3. – Jeff Sheffield Sep 23 '15 at 17:30

Without reading the comments, this is creating confusion... – PythoNic Jun 04 '22 at 14:46

score 15 · Answer 4 · edited Nov 19 '21 at 16:23

If you're using Django, you can use urlquote:

>>> from django.utils.http import urlquote
>>> urlquote(u"Müller")
u'M%C3%BCller'

Note that changes to Python mean that this is now a legacy wrapper. From the Django 2.1 source code for django.utils.http:

A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)

edited Nov 19 '21 at 16:23 by Peter Mortensen
answered Oct 27 '15 at 19:40 by Rick Westera

It's deprecated as of Django 3.0. – mosi_kha Nov 27 '21 at 12:13

score 5 · Answer 5 · edited Nov 19 '21 at 15:56

It is better to use urlencode here. There isn't much difference for a single parameter but, IMHO, it makes the code clearer. (A function named quote_plus looks confusing, especially to those coming from other languages.)

In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'

In [22]: val=34

In [23]: from urllib.parse import urlencode

In [24]: encoded = urlencode(dict(p=query,val=val))

In [25]: print(f"http://example.com?{encoded}")
http://example.com?p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34

Documentation

urlencode
quote_plus
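
As a possible extension of the example above, `urlencode` can also expand list values into repeated keys when `doseq=True` is passed, and `parse_qs` reverses the encoding. The parameter names and values are sample data.

```python
from urllib.parse import parse_qs, urlencode

params = {"tag": ["a b", "c/d"], "val": 34}  # sample data

# doseq=True emits one key=value pair per list element.
encoded = urlencode(params, doseq=True)
print(encoded)  # tag=a+b&tag=c%2Fd&val=34

# parse_qs decodes '+' and %xx escapes; every value comes back as a list of str.
decoded = parse_qs(encoded)
print(decoded)  # {'tag': ['a b', 'c/d'], 'val': ['34']}
```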

score 1 · Answer 6 · answered Aug 04 '22 at 11:01

An alternative method using furl:

import furl

url = "https://httpbin.org/get?hello,world"
print(url)
url = furl.furl(url).url
print(url)

Output:

https://httpbin.org/get?hello,world
https://httpbin.org/get?hello%2Cworld
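
If installing a third-party package is not an option, roughly the same result for this particular URL can be had from the standard library. This is only a sketch and not a description of what furl does internally. Keeping `=` and `&` safe is an assumption that the query may contain ordinary key/value pairs.

```python
from urllib.parse import quote, urlsplit, urlunsplit

url = "https://httpbin.org/get?hello,world"

parts = urlsplit(url)
# Escape the query string only; '=' and '&' stay literal so that any
# key=value&key=value structure survives.
encoded_query = quote(parts.query, safe="=&")
encoded_url = urlunsplit(parts._replace(query=encoded_query))

print(encoded_url)  # https://httpbin.org/get?hello%2Cworld
```

Note that a query which is already percent-encoded would be escaped a second time by this approach, because `%` itself is not in the safe set.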

answered Aug 04 '22 at 11:01 by BaiJiFeiLong
