96

I have

import urllib2

try:
    urllib2.urlopen("some url")
except urllib2.HTTPError:
    pass  # <whatever>: currently runs for every HTTP error

but this ends up catching every kind of HTTP error. I only want to catch the case where the requested webpage doesn't exist (404?).

Piper
Arnab Sen Gupta
  • Have you tried the recipe in this post? http://stackoverflow.com/questions/1308542/how-to-catch-404-error-in-urllib-urlretrieve – John P Jul 07 '10 at 08:29

4 Answers

161

Python 3

from urllib.error import HTTPError

Python 2

from urllib2 import HTTPError

Just catch HTTPError, handle it, and if it's not Error 404, simply use raise to re-raise the exception.

See the Python tutorial.

Here is a complete example for Python 2:

import urllib2
from urllib2 import HTTPError

try:
    urllib2.urlopen("some url")
except HTTPError as err:
    if err.code == 404:
        pass  # <whatever>: handle the missing page here
    else:
        raise  # re-raise any other HTTP error unchanged
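
For Python 3, the same catch-and-re-raise pattern might look like the sketch below. The function name `fetch_or_none` and the choice to return `None` for a missing page are assumptions made for illustration, not part of the original answer.

```python
import urllib.request
from urllib.error import HTTPError


def fetch_or_none(url):
    """Return the response body as bytes, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None  # the page does not exist
        raise  # any other HTTP error propagates to the caller
```

A caller can then tell "page missing" (the result is `None`) apart from every other failure, which still surfaces as an exception.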
Wouter
Tim Pietzcker
  • Can I use urllib2.urlopen("*") to handle any 404 errors and route them to my 404.html page? –  Oct 01 '15 at 15:36
  • @TobiasKolb: Since the question is tagged `urllib2` (after all, it's over 9 years old) and `urllib3` is not part of the standard library, I think that wouldn't fit here. If there isn't a duplicate already, maybe open a new question? Or use `urllib` as outlined in Lazik's answer below. – Tim Pietzcker Oct 28 '19 at 21:27
  • I'm writing regression tests, so I want access to the urlopen response even if it was a 404. Even if I assign the value from the `urllib2.urlopen("some url")` , I can't use that value inside the exception -- it will cause another exception. So, how do I get the response text of the 404 page that was returned? – TaiwanGrapefruitTea Jul 10 '21 at 09:13
  • I found the answer: You can use the HTTPError instance as a response. https://docs.python.org/3/howto/urllib2.html#httperror – TaiwanGrapefruitTea Jul 10 '21 at 09:42
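
As the last comment notes, the HTTPError instance can itself be read like a response. A minimal Python 3 sketch of that idea follows; the helper name `fetch_with_status` is hypothetical, and returning a `(status, body)` tuple is just one possible design.

```python
import urllib.request
from urllib.error import HTTPError


def fetch_with_status(url):
    """Return (status_code, body_bytes), even when the server answers with an error."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except HTTPError as err:
        # HTTPError doubles as a response object: it offers .code, .headers and .read()
        return err.code, err.read()
```

In a regression test this lets you assert on the text of the 404 page instead of losing it to the exception.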
49

For Python 3.x

import urllib.request
import urllib.error
try:
    urllib.request.urlretrieve(url, fullpath)
except urllib.error.HTTPError as err:
    print(err.code)
Lazik
5

Tim's answer seems misleading to me, especially when urllib2 does not return the expected code. For example, this error will be fatal (believe it or not, it is not uncommon when downloading URLs):

AttributeError: 'URLError' object has no attribute 'code'

A quick, though perhaps not ideal, solution is to use a nested try/except block:

import urllib2

try:
    urllib2.urlopen("some url")
except urllib2.HTTPError as err:
    try:
        if err.code == 404:
            pass  # Handle the error
        else:
            raise
    except Exception:
        pass  # Handle anything else, including the re-raised error

For more on nested try/except blocks, see "Are nested try/except blocks in Python a good programming practice?"
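
An alternative to nesting is to handle the two exception types in separate clauses. In Python 3, `HTTPError` is a subclass of `URLError`, so listing it first keeps the `.code` check away from plain `URLError` instances, which only carry `.reason`. The sketch below assumes Python 3; the function name `probe` and its return strings are made up for illustration.

```python
import urllib.request
from urllib.error import HTTPError, URLError


def probe(url):
    """Classify the outcome of opening a URL without assuming every error has .code."""
    try:
        with urllib.request.urlopen(url):
            return "ok"
    except HTTPError as err:
        # Must come first: HTTPError is a subclass of URLError.
        if err.code == 404:
            return "not found"
        return "HTTP error %d" % err.code
    except URLError as err:
        # No HTTP reply at all (DNS failure, refused connection, ...).
        return "unreachable: %s" % err.reason
```

Swapping the two except clauses would send every HTTP error into the `URLError` branch, so the order matters here.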

NelsonGon
sonavolob
0

If from urllib.error import HTTPError doesn't work (for example, because the request is made with the requests library rather than urllib), try using from requests.exceptions import HTTPError.

Sample:

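import requests
from requests.exceptions import HTTPError

try:
    response = requests.get("some url")  # <access some url>
    response.raise_for_status()  # raises HTTPError for 4xx/5xx responses
except HTTPError:
    pass  # Handle the error like usual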
Jubin Ben