Looking at the urllib2 source, the easiest way to do this seems to be to subclass HTTPRedirectHandler and pass it to build_opener in place of the default HTTPRedirectHandler, but that seems like a lot of (relatively complicated) work for something that should be pretty simple.
7 Answers
271
Here is the Requests way:
import requests
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code, r.headers['Location'])
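Several comments below run into a `KeyError` when the response is not a redirect. A minimal sketch of a more defensive version, assuming `requests` is installed; `redirect_target` and `first_hop` are hypothetical helper names, not part of the library:

```python
import requests


def redirect_target(response):
    """Return the Location header of a redirect response, or None otherwise."""
    # is_redirect is True only for a redirect status that carries a Location header.
    if response.is_redirect:
        return response.headers.get("Location")
    return None


def first_hop(url):
    """Fetch url without following redirects; return (status, target or None)."""
    r = requests.get(url, allow_redirects=False, timeout=10)
    return r.status_code, redirect_target(r)
```

Calling `first_hop('http://github.com')` should then give you the status code and the redirect target (or `None`) without raising on a plain 200.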

Marian
- Then look at `r.headers['Location']` to see where it would have sent you – patricksurry Jan 12 '17 at 16:43
- Note that it seems that Requests will normalize `Location` to `location`. – Hamish May 12 '17 at 01:36
- @Hamish `requests` allows you to access headers both in the canonical form and in lowercase. See http://docs.python-requests.org/en/master/user/quickstart/#response-headers – Marian May 12 '17 at 07:21
- As of 2019 in Python 3, this no longer appears to work for me. (I get a key dict error.) – Max von Hippel Aug 15 '19 at 00:19
- Check `r.status_code`: if it is not 301, there might have been another error. The `Location` header is only available for redirects. Use `dict.get` if you want to avoid a `KeyError` on optional keys. – user3504575 Jan 20 '21 at 15:00
- TypeError: request() got an unexpected keyword argument 'max_redirects' – CS QGB May 07 '21 at 10:00
35
Dive Into Python has a good chapter on handling redirects with urllib2. Another solution is httplib.
>>> import httplib
>>> conn = httplib.HTTPConnection("www.bogosoft.com")
>>> conn.request("GET", "")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
301 Moved Permanently
>>> print r1.getheader('Location')
http://www.bogosoft.com/new/location
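On Python 3 the `httplib` module is called `http.client`. A rough equivalent of the session above might look like this; `first_response` is a hypothetical helper name and the 10 second timeout is an arbitrary choice:

```python
import http.client


def first_response(host, path="/", port=80):
    """Issue one GET and return (status, reason, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        # http.client never follows redirects, so a 3xx comes back as-is.
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()
```

`getheader` returns `None` when the header is missing, so this does not raise on non-redirect responses.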

Fábio Batista

olt
- Everybody who comes here from Google, please note that the up-to-date way to go is this one: http://stackoverflow.com/a/14678220/362951 The requests library will save you a lot of headaches. – mit May 05 '14 at 02:36
12
This is a urllib2 handler that will not follow redirects:
import urllib
import urllib2

class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Wrap the raw response instead of following the Location header.
        infourl = urllib.addinfourl(fp, headers, req.get_full_url())
        infourl.status = code
        infourl.code = code
        return infourl

    # Handle every redirect status the same way.
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302

opener = urllib2.build_opener(NoRedirectHandler())
urllib2.install_opener(opener)
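For Python 3, where `urllib2` was folded into `urllib.request`, a port of this handler could look roughly like the sketch below. The helper `open_without_redirect`, the extra `http_error_308` alias, and the timeout are my own assumptions, and the status is passed to the `addinfourl` constructor rather than assigned afterwards:

```python
import urllib.request
import urllib.response


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hand the 3xx response back to the caller instead of following it."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Wrap the raw response so callers can read .code, .headers and .url.
        return urllib.response.addinfourl(fp, headers, req.full_url, code)

    # Treat every redirect status the same way.
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302
    http_error_308 = http_error_302


def open_without_redirect(url):
    """Open url with an opener that never follows redirects."""
    opener = urllib.request.build_opener(NoRedirectHandler())
    return opener.open(url, timeout=10)
```

With this, `resp = open_without_redirect(url)` gives you the redirect response itself, and `resp.headers.get("Location")` tells you where it pointed.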

Carles Barrobés
- I'm unit testing an API and dealing with a login method that redirects to a page I don't care about, but doesn't send the desired session cookie with the response to the redirect. This is exactly what I needed for that. – Tim Wilder Feb 11 '14 at 23:41
9
The `redirections` keyword in the httplib2 `request` method is a red herring. Rather than return the response to the first request, it will raise a `RedirectLimit` exception if it receives a redirection status code. To return the initial response you need to set `follow_redirects` to `False` on the `Http` object:
import httplib2
h = httplib2.Http()
h.follow_redirects = False
(response, body) = h.request("http://example.com")

Ian Mackinnon
8
I suppose this would help:
from httplib2 import Http

def get_html(uri, num_redirections=0):  # pass 0 so that redirects are not followed
    conn = Http()
    return conn.request(uri, redirections=num_redirections)

Ashish
6
The shortest way, however, is:
class NoRedirect(urllib2.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        pass

noredir_opener = urllib2.build_opener(NoRedirect())
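To show how the opener is actually used, here is a sketch for Python 3 (`urllib.request`), under the assumption that you want the status and `Location` back as plain values. When `redirect_request` returns `None`, the redirect is refused and urllib reports the 3xx as an `HTTPError`, which the hypothetical helper `fetch_without_redirect` catches:

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse every redirect by returning None from redirect_request."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_without_redirect(url):
    """Return (status, Location header or None) without following redirects."""
    opener = urllib.request.build_opener(NoRedirect())
    try:
        with opener.open(url, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # A refused redirect surfaces here; other HTTP errors do too.
        return err.code, err.headers.get("Location")
```

Note that 4xx and 5xx responses end up in the same `except` branch, so check the returned status before trusting the `Location` value.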

Tzury Bar Yochay
- How is this the shortest way? It doesn't even contain the import or the actual request. – Marian May 09 '13 at 18:49
- I already was going to post this solution and was quite surprised to find this answer at the bottom. It is very concise and should be the top answer in my opinion. – user Jan 21 '15 at 01:55
- Moreover, it gives you more freedom, this way it's possible to [control which URLs to follow](http://stackoverflow.com/a/28057731/3075942). – user Jan 21 '15 at 02:14
- I confirm, this is the easiest way. A short remark for those who want to debug: do not forget that you may set multiple handlers when building the opener, like `opener = urllib.request.build_opener(debugHandler, NoRedirect())`, where `debugHandler = urllib.request.HTTPHandler()` and `debugHandler.set_http_debuglevel(1)`. In the end: `urllib.request.install_opener(opener)` – StashOfCode Jan 13 '20 at 13:05
5
I second olt's pointer to Dive Into Python. Here's an implementation using urllib2 redirect handlers. More work than it should be? Maybe, shrug.
import sys
import urllib2

class RedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        result.status = code
        raise Exception("Permanent Redirect: %s" % 301)

    def http_error_302(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        result.status = code
        raise Exception("Temporary Redirect: %s" % 302)

def main(script_name, url):
    opener = urllib2.build_opener(RedirectHandler)
    urllib2.install_opener(opener)
    print urllib2.urlopen(url).read()

if __name__ == "__main__":
    main(*sys.argv)

Aaron Maenpaa
- Looks wrong... This code does actually follow the redirects (by calling the original handler, thus issuing an HTTP request), and then raises an exception – Carles Barrobés Mar 18 '11 at 12:40