How to get the code of the headers through urllib?
4 Answers
192
The getcode() method (added in Python 2.6) returns the HTTP status code that was sent with the response, or None if the URL is not an HTTP URL.
>>> a=urllib.urlopen('http://www.google.com/asdfsf')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200
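
For Python 3, the same idea can be sketched as follows. This is a minimal sketch, not the only way to do it: `get_status` and its `opener` parameter are hypothetical names introduced here so the function can be tried without network access. It assumes the Python 3 behaviour in which `urlopen` raises `urllib.error.HTTPError` for responses such as 404 instead of returning them.

```python
import urllib.error
import urllib.request


def get_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, including error responses.

    `opener` is a hypothetical injection point (not part of urllib);
    by default the real urllib.request.urlopen is used.
    """
    try:
        with opener(url) as response:
            # The response object exposes the code as .status;
            # response.getcode() returns the same value.
            return response.status
    except urllib.error.HTTPError as e:
        # Error responses (4xx/5xx) are raised; the code is on e.code.
        return e.code
```

Calling `get_status('http://www.google.com/')` would then be expected to return the status as an integer, whether or not the request succeeded at the HTTP level.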

Nadia Alramli
- To use in Python 3, just use `from urllib.request import urlopen`. – Nathanael Farley May 15 '16 at 20:17
- In Python 3.4, if there is a 404, `urllib.request.urlopen` raises a `urllib.error.HTTPError`. – mcb May 10 '17 at 07:43
- Doesn't work in Python 2.7. If the server returns 400, an exception is thrown. – Nathan B Feb 13 '19 at 14:32
91
You can use urllib2 as well:
import urllib2

req = urllib2.Request('http://www.python.org/fish.html')
try:
    resp = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    # The server answered with an error status; e.code holds it
    if e.code == 404:
        pass  # do something...
    else:
        pass  # ...
except urllib2.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    pass  # ...
else:
    # 200
    body = resp.read()
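Note that HTTPError is a subclass of URLError, and it is HTTPError that stores the HTTP status code (in its code attribute). Because of that, the HTTPError handler has to come before the URLError handler.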
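
The subclass relationship can be checked directly. The sketch below uses Python 3, where these classes live in `urllib.error` rather than `urllib2`; `describe_failure` is a hypothetical helper written only for illustration.

```python
import urllib.error


def describe_failure(exc):
    """Summarise an exception raised by urlopen (illustrative helper)."""
    # Test for HTTPError first: every HTTPError is also a URLError,
    # so the more specific check has to come before the general one.
    if isinstance(exc, urllib.error.HTTPError):
        return 'HTTP status {}'.format(exc.code)
    if isinstance(exc, urllib.error.URLError):
        return 'connection problem: {}'.format(exc.reason)
    return 'unexpected: {!r}'.format(exc)


# Confirms the class hierarchy described in the note
assert issubclass(urllib.error.HTTPError, urllib.error.URLError)
```

If the two checks were swapped, an HTTPError would be reported as a generic connection problem and its status code would never be looked at.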

AndiDog

Joe Holloway
- @NadavB The exception object `e` will look like a response object. That is, it's file-like and you can `read` the payload from it. – Joe Holloway Feb 14 '19 at 16:23
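
The point in the comment above can be illustrated with a short Python 3 sketch. `fetch` and its `opener` parameter are hypothetical names used here for illustration (the parameter only exists so the function can be tried offline); the sketch assumes that the raised `HTTPError` carries the response body, as the comment describes.

```python
import urllib.error
import urllib.request


def fetch(url, opener=urllib.request.urlopen):
    """Return (status_code, body_bytes) for success and HTTP error responses.

    `opener` is a hypothetical injection point; it defaults to the real
    urllib.request.urlopen.
    """
    try:
        with opener(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as e:
        # The exception is file-like, so the error page can be read from it
        return e.code, e.read()
```

With this shape the caller gets the status code and whatever payload the server sent, for example the HTML of a 404 page, without handling the exception itself.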
56
For Python 3:
import urllib.request, urllib.error

url = 'http://www.google.com/asdfsf'
try:
    conn = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    # Return code error (e.g. 404, 501, ...)
    # ...
    print('HTTPError: {}'.format(e.code))
except urllib.error.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    # ...
    print('URLError: {}'.format(e.reason))
else:
    # 200
    # ...
    print('good')

Arpad Horvath -- Слава Україні

XavierCLL
- For [URLError](https://docs.python.org/3.5/library/urllib.error.html), `print(e.reason)` could be used. – Gitnik Aug 04 '17 at 20:23
6
import urllib2

try:
    fileHandle = urllib2.urlopen('http://www.python.org/fish.html')
    data = fileHandle.read()
    fileHandle.close()
except urllib2.URLError as e:
    # e is the exception object itself, not a bare status code
    print 'you got an error with the code', e

mrme
- TIMEX is interested in grabbing the HTTP status code (200, 404, 500, etc.), not a generic error thrown by urllib2. – Joshua Burns Jul 09 '12 at 15:34