
I know redirection can be handled with either the requests or the urllib2 package. I do not want to use requests because it is not part of the standard library. Please help me use urllib2 to handle a URL that returns a 302 and then ends in a 404. I am not concerned about the 404; I only want to track whether the redirect is a 301 or a 302.

I referred to this doc, but my code still raises the 404.

Here is my code:

import urllib2
class My_HTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

my_opener = urllib2.build_opener(My_HTTPRedirectHandler)
urllib2.install_opener(my_opener)
response = urllib2.urlopen("MY URL")
print response.read()

Here is the traceback I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 438, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "<stdin>", line 3, in http_error_302
  File "/usr/lib/python2.7/urllib2.py", line 625, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found    
Sathy
  • Please, show some code you already wrote. – maurelio79 Jan 13 '14 at 12:35
  • *"I am not concerned about the 404 and I would like to track whether it is a 301 or 302."* -- [the link that you provided](http://www.diveintopython.net/http_web_services/redirects.html) already contains the solution: [`SmartRedirectHandler`](http://www.diveintopython.net/http_web_services/redirects.html#oa.redirect.2.1), use `response.status` attribute to find out whether it is 301 or 302 redirect. – jfs Mar 02 '14 at 04:59
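The approach suggested in the comment above could look roughly like the following. This is only a sketch, not the `SmartRedirectHandler` from the linked page: the names `RecordingRedirectHandler` and `trace_redirects` are made up for illustration, and the `try`/`except` import is an assumption added so the same code also runs on Python 3, where `urllib2` became `urllib.request`. The handler records each redirect status as it is followed, and the final 404 is caught instead of being allowed to propagate.

```python
try:
    import urllib2                      # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2    # Python 3 (assumption: same handler API)


class RecordingRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that remembers every redirect it follows."""

    def __init__(self):
        self.seen = []  # list of (status_code, target_url) tuples

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Called for each 301/302/303/307 before the new request is built.
        self.seen.append((code, newurl))
        return urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


def trace_redirects(url):
    """Return (redirects, final_status); a final 404 is reported, not raised."""
    handler = RecordingRedirectHandler()
    opener = urllib2.build_opener(handler)
    try:
        response = opener.open(url)
        final_status = response.getcode()
        response.close()
    except urllib2.HTTPError as err:
        # The redirect target answered with an error status such as 404.
        final_status = err.code
    return handler.seen, final_status
```

Calling `redirects, final_status = trace_redirects("MY URL")` should then give a list such as `[(302, <target URL>)]` together with `404` for the URL described in the question, so the redirect type can be inspected even though the target does not exist.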

1 Answer


It's better to use httplib with the HTTP/1.1 HEAD method. httplib does not follow redirects, so the 301 or 302 status is returned to you directly, and with HEAD no response body is received.
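A minimal sketch of that idea follows. The helper name `head_status` and the 10-second default timeout are illustrative assumptions, and the fallback import is only there so the snippet also runs on Python 3, where `httplib` is `http.client`. It assumes a plain `host[:port]` URL without embedded credentials.

```python
try:
    import httplib                      # Python 2
    from urlparse import urlsplit
except ImportError:
    import http.client as httplib       # Python 3
    from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send a single HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = httplib.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = httplib.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        # No redirect is followed here, so a 301 or 302 shows up as-is.
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

With `status, location = head_status("MY URL")`, a `status` of 301 or 302 identifies the redirect type, and `location` is where the server wants to send the client. The 404 at the redirect target is never requested, so it cannot raise.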

What’s the best way to get an HTTP response code from a URL?

Community
Sam Gleske