In Python, what are the differences between the `urllib`, `urllib2`, `urllib3` and `requests` modules? Why are there so many? They seem to do the same thing...

- This question should be updated to clarify that `urllib` in Python 3 is yet another option, cleaned up in various ways. But thankfully the official documentation also notes that "_The Requests package is recommended for a higher-level HTTP client interface._" at [21.6. urllib.request — Extensible library for opening URLs — Python 3.6.3 documentation](https://docs.python.org/3/library/urllib.request.html) – nealmcb Oct 15 '17 at 16:04
- Sadly, I didn't see any answers telling me what `urllib3` is and how `urllib3` is different from the official `urllib` module. – Rick Mar 13 '20 at 09:05
- Probably worth mentioning [httpx](https://github.com/encode/httpx), the newer requests-backwards-compatible async library. – ccpizza Oct 19 '20 at 14:23
11 Answers
I know it's been said already, but I'd highly recommend the `requests` Python package.
If you've used languages other than Python, you're probably thinking `urllib` and `urllib2` are easy to use, not much code, and highly capable; that's how I used to think. But the `requests` package is so unbelievably useful and short that everyone should be using it.
First, it supports a fully RESTful API, and is as easy as:
import requests
resp = requests.get('http://www.mywebsite.com/user')
resp = requests.post('http://www.mywebsite.com/user')
resp = requests.put('http://www.mywebsite.com/user/put')
resp = requests.delete('http://www.mywebsite.com/user/delete')
Regardless of whether you use GET or POST, you never have to encode parameters again; it simply takes a dictionary as an argument and is good to go:
userdata = {"firstname": "John", "lastname": "Doe", "password": "jdoe123"}
resp = requests.post('http://www.mywebsite.com/user', data=userdata)
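The same applies to query strings on GET requests: pass a `params` dictionary and `requests` URL-encodes it for you. A small sketch (the URL is a placeholder; `Request.prepare()` builds the request without sending anything over the network):

```python
import requests

# Build (but don't send) a GET request with a params dict;
# requests URL-encodes the query string automatically.
req = requests.Request(
    "GET",
    "http://www.mywebsite.com/user",
    params={"firstname": "John", "lastname": "Doe"},
).prepare()

print(req.url)  # http://www.mywebsite.com/user?firstname=John&lastname=Doe
```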
Plus it even has a built-in JSON decoder (again, I know `json.loads()` isn't a lot more to write, but this sure is convenient):
resp.json()
Or if your response data is just text, use:
resp.text
This is just the tip of the iceberg. This is the list of features from the requests site:
- International Domains and URLs
- Keep-Alive & Connection Pooling
- Sessions with Cookie Persistence
- Browser-style SSL Verification
- Basic/Digest Authentication
- Elegant Key/Value Cookies
- Automatic Decompression
- Unicode Response Bodies
- Multipart File Uploads
- Connection Timeouts
- .netrc support
- Python 2.7, 3.6–3.9
- Thread-safe.
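Several of the features above (keep-alive, connection pooling, cookie persistence) come together in `requests.Session`. A minimal sketch; the `User-Agent` value and URL are placeholders, and the actual request line is commented out so nothing is sent:

```python
import requests

# A Session reuses the underlying connection pool and persists
# cookies and headers across requests.
session = requests.Session()
session.headers.update({"User-Agent": "my-example-client/1.0"})

# resp = session.get("http://www.mywebsite.com/user", timeout=5)  # placeholder URL

print(session.headers["User-Agent"])  # my-example-client/1.0
```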

- It would help to note that the Python 3 documentation has yet another distinct library `urllib`, and that its documentation also officially notes that "_The Requests package is recommended for a higher-level HTTP client interface._" at [21.6. urllib.request — Extensible library for opening URLs — Python 3.6.3 documentation](https://docs.python.org/3/library/urllib.request.html), and that `urllib3` is a great library used by `requests`. – nealmcb Oct 15 '17 at 16:11
- OK, except I have the impression [requests has no replacement](https://stackoverflow.com/a/28328919/673991) for `urllib.parse()` – Bob Stein May 18 '18 at 15:50
- I don't understand why this is the accepted answer. It didn't answer OP's question. – Tyler Crompton Aug 23 '21 at 14:12
- @TylerCrompton Technically, yes, but the implication is that the `urllib` packages are less user-friendly – Mike B Sep 21 '22 at 17:10
- @TylerCrompton Because it answers the real question: which one to use? – Sergei Mar 21 '23 at 21:10
In the Python 2 standard library there were two HTTP libraries that existed side-by-side. Despite the similar name, they were unrelated: they had a different design and a different implementation.
`urllib` was the original Python HTTP client, added to the standard library in Python 1.2. Earlier documentation for `urllib` can be found in Python 1.4.
`urllib2` was a more capable HTTP client, added in Python 1.6, intended as a replacement for `urllib`:
> urllib2 - new and improved but incompatible version of urllib (still experimental).
Earlier documentation for `urllib2` can be found in Python 2.1.
The Python 3 standard library has a new `urllib`, which is a merged/refactored/rewritten version of the older modules.
`urllib3` is a third-party package (i.e., not in CPython's standard library). Despite the name, it is unrelated to the standard library packages, and there is no intention to include it in the standard library in the future.
Finally, `requests` internally uses `urllib3`, but it aims for an easier-to-use API.

- Great answer; now I have another reason not to use requests, and to be more confident when using the new `urllib`. Is there any discussion/announcement about the change and enhancement of the new `urllib`? I didn't find any but I'm really interested in knowing more about it. – Reorx Aug 11 '22 at 16:09
urllib2 provides some extra functionality: namely, the `urlopen()` function allows you to specify headers (normally you'd have had to use httplib in the past, which is far more verbose). More importantly, urllib2 provides the `Request` class, which allows for a more declarative approach to doing a request:
r = Request(url='http://www.mysite.com')
r.add_header('User-Agent', 'awesome fetcher')
r.add_data(urllib.urlencode({'foo': 'bar'}))
response = urlopen(r)
Note that `urlencode()` is only in urllib, not urllib2.
There are also handlers for implementing more advanced URL support in urllib2. The short answer is: unless you're working with legacy code, you probably want to use the URL opener from urllib2, but you still need to import urllib for some of the utility functions.
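For readers on Python 3, where the two modules were merged, roughly the same request can be sketched with `urllib.request` and `urllib.parse` (the URL and header value are placeholders; nothing is sent until `urlopen()` is called):

```python
from urllib.parse import urlencode
from urllib.request import Request

# Build a request declaratively; the body must be bytes in Python 3.
req = Request("http://www.mysite.com", data=urlencode({"foo": "bar"}).encode())
req.add_header("User-Agent", "awesome fetcher")

print(req.get_full_url())  # http://www.mysite.com
print(req.get_method())    # POST (because a body was supplied)
```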
Bonus answer With Google App Engine, you can use any of httplib, urllib or urllib2, but all of them are just wrappers for Google's URL Fetch API. That is, you are still subject to the same limitations such as ports, protocols, and the length of the response allowed. You can use the core of the libraries as you would expect for retrieving HTTP URLs, though.

- How does somebody create a url with an encoded query string using urllib2? It's the only reason I'm using urllib and I'd like to make sure I'm doing everything the latest/greatest way. – Gattster Jan 07 '10 at 08:51
- Like in my above example, you use `urlopen()` and `Request` from *urllib2*, and you use `urlencode()` from *urllib*. No real harm in using both libraries, as long as you make sure you use the correct urlopen. The [urllib docs](http://docs.python.org/library/urllib2.html#urllib2.urlopen) are clear that this is accepted usage. – Crast Jan 07 '10 at 14:12
- I used [this](https://gist.github.com/vgoklani/1811970) gist for `urllib2.urlopen`; it contains other variations too. – Andrei-Niculae Petre Jun 30 '14 at 10:18
- `requests` also allows custom headers: http://docs.python-requests.org/en/master/user/quickstart/#custom-headers – Omer Dagan Sep 13 '18 at 11:09
urllib and urllib2 are both Python modules that do URL-request-related work but offer different functionality.
1) urllib2 can accept a Request object to set the headers for a URL request; urllib accepts only a URL.
2) urllib provides the urlencode method, which is used for the generation of GET query strings; urllib2 doesn't have such a function. This is one of the reasons why urllib is often used along with urllib2.
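In Python 3 that helper lives in `urllib.parse`; a quick sketch of building a GET query string (the field names are just example data):

```python
from urllib.parse import urlencode

# Turn a dict into a URL-encoded query string, e.g. for appending to a URL.
query = urlencode({"firstname": "John", "lastname": "Doe"})
print(query)  # firstname=John&lastname=Doe
```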
Requests - Requests is a simple, easy-to-use HTTP library written in Python.
1) Python Requests encodes the parameters automatically, so you just pass them as simple arguments, unlike in the case of urllib, where you need to use the method urllib.urlencode() to encode the parameters before passing them.
2) It automatically decodes the response into Unicode.
3) Requests also has far more convenient error handling. If your authentication failed, urllib2 would raise a urllib2.URLError, while Requests would return a normal response object, as expected. All you have to do to see if the request was successful is check the boolean `response.ok`.
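That `response.ok` check can be illustrated without hitting the network by constructing a bare `Response` object (an unusual thing to do outside of tests, used here only to show the attribute):

```python
import requests

# Simulate a failed response; .ok is False for any 4xx/5xx status code.
resp = requests.models.Response()
resp.status_code = 404
print(resp.ok)  # False

resp.status_code = 200
print(resp.ok)  # True
```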

- @PirateApp [requests](https://github.com/psf/requests/) is built on top of [urllib3](https://github.com/urllib3/urllib3). I think code using urllib3 directly can be more efficient, because it lets you reuse the session, whereas requests (at least requests 2, the one everyone uses) creates one for every request, but don't quote me on that. Neither are part of the standard library ([yet](https://github.com/psf/requests/issues/2424)) – Boris Verkhovskiy Dec 28 '19 at 22:09
Just to add to the existing answers, I don't see anyone mentioning that Python requests is not a native library. If you are OK with adding dependencies, then requests is fine. However, if you are trying to avoid adding dependencies, urllib is a native Python library that is already available to you.

- True, if you want to avoid adding any dependencies, urllib is available. But note that even the [Python official documentation](https://docs.python.org/3/library/urllib.request.html#module-urllib.request) recommends the requests library: "The Requests package is recommended for a higher-level HTTP client interface." – hlongmore Jun 09 '20 at 18:00
- @hlongmore Of course, most people wouldn't want to deal with low-level urllib, and the Requests library provides a nice level of abstraction. It's like using a pancake mix in a box versus making it from scratch. Pros and cons. – Zeitgeist Jun 10 '20 at 17:46
One considerable difference is about porting Python 2 to Python 3. urllib2 does not exist for Python 3, and its methods were ported to urllib. So if you are using it heavily and want to migrate to Python 3 in the future, consider using urllib. However, the 2to3 tool will automatically do most of the work for you.
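A common shim for code that has to run on both versions is to try the Python 3 import first and fall back; this is a sketch of the pattern, not something 2to3 itself emits:

```python
try:
    # Python 3: urllib2's contents were split into urllib.request and urllib.error.
    from urllib.request import urlopen, Request
    from urllib.error import URLError
except ImportError:
    # Python 2 fallback.
    from urllib2 import urlopen, Request, URLError

print(callable(urlopen))  # True
```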

- Can you please go into more detail? If urllib2 does not exist on Python 3 anymore, was it removed or renamed? And why did that happen, do you have any external link? – NicoHood Sep 05 '22 at 05:36
I think all the answers are pretty good, but there are fewer details about urllib3. urllib3 is a very powerful HTTP client for Python. To install it, either of the following will work.
Using pip:
pip install urllib3
or you can get the latest code from GitHub and install it with:
$ git clone git://github.com/urllib3/urllib3.git
$ cd urllib3
$ python setup.py install
Then you are ready to go. Just import urllib3:
import urllib3
Here, instead of creating a connection directly, you'll need a PoolManager instance to make requests. This handles connection pooling and thread safety for you. There is also a ProxyManager object for routing requests through an HTTP/HTTPS proxy; you can refer to the documentation. Example usage:
>>> from urllib3 import PoolManager
>>> manager = PoolManager(10)
>>> r = manager.request('GET', 'http://google.com/')
>>> r.headers['server']
'gws'
>>> r = manager.request('GET', 'http://yahoo.com/')
>>> r.headers['server']
'YTS/1.20.0'
>>> r = manager.request('POST', 'http://google.com/mail')
>>> r = manager.request('HEAD', 'http://google.com/calendar')
>>> len(manager.pools)
2
>>> conn = manager.connection_from_host('google.com')
>>> conn.num_requests
3
As mentioned in the `urllib3` documentation, `urllib3` brings many critical features that are missing from the Python standard libraries:
- Thread safety.
- Connection pooling.
- Client-side SSL/TLS verification.
- File uploads with multipart encoding.
- Helpers for retrying requests and dealing with HTTP redirects.
- Support for gzip and deflate encoding.
- Proxy support for HTTP and SOCKS.
- 100% test coverage.
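The retry and timeout helpers from that list can be configured on the pool itself. A minimal sketch under the current urllib3 API (the numbers are arbitrary examples; no request is sent here):

```python
import urllib3
from urllib3.util import Retry, Timeout

# Pool-wide defaults: retry transient failures with exponential backoff,
# and bound both the connect and read phases of each request.
http = urllib3.PoolManager(
    retries=Retry(total=3, backoff_factor=0.5),
    timeout=Timeout(connect=2.0, read=5.0),
)

print(Retry(total=3).total)  # 3
```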
Follow the user guide for more details.
- Response content (The HTTPResponse object provides status, data, and header attributes)
- Using io Wrappers with Response content
- Creating a query parameter
- Advanced usage of urllib3
requests
requests uses `urllib3` under the hood and makes it even simpler to make requests and retrieve data. For one thing, keep-alive is 100% automatic, compared to `urllib3`, where it's not. It also has event hooks which call a callback function when an event is triggered, like receiving a response.
In `requests`, each request type has its own function. So instead of creating a connection or a pool, you directly GET a URL.
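Those event hooks can be sketched without any network traffic by registering a callback on a session (the callback name is just an example):

```python
import requests

def log_response(resp, *args, **kwargs):
    # Called by requests after each response is received.
    print(resp.status_code, resp.url)

s = requests.Session()
s.hooks["response"].append(log_response)

print(log_response in s.hooks["response"])  # True
```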
To install `requests` using pip, just run:
pip install requests
or you can just install from source code,
$ git clone git://github.com/psf/requests.git
$ cd requests
$ python setup.py install
Then, import requests.
Here you can refer to the official documentation; for some advanced usage like session objects, SSL verification, and event hooks, please refer to this url.

- Thank you for this answer. I came here because I had seen `urllib3` and didn't know if I should use it or `requests`. Now I feel informed about how to make that decision going forward. The accepted answer gives a nice breakdown of `requests` but does not differentiate it from the alternatives. – causaSui Sep 14 '20 at 20:20
- If you are afflicted by a corporate proxy, know that the requests module cheerfully honors the environment variables http_proxy, https_proxy, and no_proxy. The urllib3 module ignores environment variables; to send your queries via a proxy you must create an instance of ProxyManager instead of PoolManager. – chrisinmtown May 05 '21 at 15:45
I like the `urllib.urlencode` function, and it doesn't appear to exist in `urllib2`.
>>> urllib.urlencode({'abc':'d f', 'def': '-!2'})
'abc=d+f&def=-%212'

- Just a note: be careful with urlencode, as it can't handle unicode objects directly; you have to encode them before sending them to urlencode (u'blá'.encode('utf-8'), or whatever). – Jun 27 '11 at 02:12
- @user18015: I do not think this applies to Python 3, can you clarify? – Janus Troelsen Dec 17 '12 at 16:10
- As I noted above, this question and the various answers should be updated to clarify that `urllib` in Python 3 is yet another option, cleaned up in various ways. But thankfully, the official documentation also notes that "_The Requests package is recommended for a higher-level HTTP client interface._" at [21.6. urllib.request — Extensible library for opening URLs — Python 3.6.3 documentation](https://docs.python.org/3/library/urllib.request.html) – nealmcb Oct 15 '17 at 16:06
To get the content of a url:
try:  # Try importing requests first.
    import requests
except ImportError:
    try:  # Fall back to the Python 3 urllib.
        import urllib.request
    except ImportError:  # Finally, the Python 2 urllib.
        import urllib

def get_content(url):
    try:  # Using requests.
        return requests.get(url).content  # requests.get() returns a requests.models.Response.
    except NameError:
        try:  # Using Python 3 urllib.
            with urllib.request.urlopen(url) as response:
                return response.read()  # urlopen() returns an http.client.HTTPResponse.
        except AttributeError:  # Using Python 2 urllib.
            return urllib.urlopen(url).read()  # urlopen() returns an instance.
It's hard to write code that works with Python 2, Python 3 and the `requests` dependency, because the `urlopen()` functions and the `requests.get()` function return different types:
- Python 3's `urllib.request.urlopen()` returns an `http.client.HTTPResponse`
- Python 2's `urllib.urlopen(url)` returns an instance
- `requests.get(url)` returns a `requests.models.Response`

You should generally use urllib2, since this makes things a bit easier at times by accepting Request objects, and it will also raise a URLError on protocol errors. With Google App Engine though, you can't use either. You have to use the URL Fetch API that Google provides in its sandboxed Python environment.

- What you said about appengine is not entirely true. You can actually use httplib, urllib, and urllib2 in App Engine now (they are wrappers for url fetch, done so that more code would be compatible with appengine). – Crast Jan 07 '10 at 03:45
- Ah, must be new. My code failed last I tried and had to be rewritten to work with fetch... – Chinmay Kanchi Jan 07 '10 at 10:30
- https://devsite.googleplex.com/appengine/docs/python/urlfetch/overview#Fetching_URLs_in_Python – allyourcode Apr 07 '12 at 02:29
- @Boris It migrated to [urllib.request](https://docs.python.org/3/library/urllib.request.html) and [urllib.error](https://docs.python.org/3/library/urllib.error.html). – Alan Apr 11 '20 at 00:30
A key point that I find missing in the above answers is that urllib returns an object of type `<class 'http.client.HTTPResponse'>`, whereas `requests` returns `<class 'requests.models.Response'>`.
Due to this, the read() method can be used with `urllib` but not with `requests`.
P.S.: `requests` is already rich with so many methods that it hardly needs one more like `read()` ;>
