39

When a urllib2 request times out, a urllib2.URLError exception is raised. What is the Pythonic way to retry establishing the connection?

LinPy
iTayb
  • 1
    This question should answer yours: http://stackoverflow.com/questions/2712524/handling-urllib2s-timeout-python – Karl Barker Feb 25 '12 at 17:55
  • 3
    I didn't ask how to catch the exception. I wanted to know if there is a Pythonic way to retry establishing the connection. – iTayb Feb 25 '12 at 18:04
  • Sorry, I assumed the problem was in detecting that the timeout had been reached, not in re-establishing the connection. Could you not call urlopen() in the exception block? – Karl Barker Feb 25 '12 at 18:07
  • 2
    That is possible, but doesn't seem very pythonic. If I'd like to retry three times, I'll have to nest the try-except blocks, and it looks ugly. – iTayb Feb 25 '12 at 18:12
  • 1
    This might be of some help then: http://stackoverflow.com/questions/567622/is-there-a-pythonic-way-to-try-something-up-to-a-maximum-number-of-times – Karl Barker Feb 25 '12 at 18:24

4 Answers

71

I would use a retry decorator. There are other ones out there, but this one works pretty well. Here's how you can use it:

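import urllib2

# `retry` is the decorator from the linked snippet; it is not part of the standard library.
@retry(urllib2.URLError, tries=4, delay=3, backoff=2)
def urlopen_with_retry():
    return urllib2.urlopen("http://example.com")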

This will retry the function if URLError is raised. Check the link above for documentation on the parameters, but basically it will try a maximum of 4 times, with an exponential backoff delay that doubles after each failure, e.g. 3 seconds, 6 seconds, 12 seconds.
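
The linked decorator is not reproduced in this answer. As a rough idea of how such a decorator can work, here is a minimal sketch written for Python 3 with only the standard library. The `retry` signature mirrors the call above, but the body is an assumption rather than the linked implementation, and the `sleep` parameter is a hypothetical addition that makes the waiting easy to test.

```python
import functools
import time


def retry(exceptions, tries=4, delay=3, backoff=2, sleep=time.sleep):
    """Retry the decorated function when `exceptions` is raised.

    Makes at most `tries` attempts. Waits `delay` seconds after the first
    failure and multiplies the wait by `backoff` after each later failure.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            # All attempts except the last one swallow the exception and wait.
            for _ in range(tries - 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    sleep(wait)
                    wait *= backoff
            # Final attempt: let any exception propagate to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

On Python 3 the exception to pass would be `urllib.error.URLError` rather than `urllib2.URLError`.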

jterrace
  • 1
    This is a really cool snippet. Do you know of an alternative that works as a context manager? – Bite code Feb 25 '12 at 18:35
  • Hmm, I think you could probably rewrite it as a context manager pretty easily, but I don't have one offhand. – jterrace Feb 25 '12 at 18:36
  • It's not easy to do, since there is no easy way to capture the block inside the with statement. You need some deep introspection. – Bite code Feb 25 '12 at 20:13
  • No, I don't think that's true. Exceptions are re-raised inside a context manager after the yield. – jterrace Feb 25 '12 at 21:03
  • 2
    The problem is not the exception, but the code raising the exception. How do you retry code if you can't run it? There is no notion of an anonymous block in Python. It's doable, but not intuitive. – Bite code Feb 26 '12 at 09:11
  • Ah, I see. It would be even harder if the calling code was not a with block. I have no idea how to do that. – jterrace Feb 26 '12 at 15:47
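
One way around the limitation discussed in these comments is to leave the loop in the caller's code and let a context manager decide only whether to swallow the exception. The sketch below illustrates that idea; it is not code from this thread, and `attempts` and `_Attempt` are hypothetical names.

```python
class _Attempt:
    """Context manager for one attempt; swallows errors unless it is the last."""

    def __init__(self, exceptions, is_last):
        self.exceptions = exceptions
        self.is_last = is_last
        self.succeeded = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.succeeded = True
            return False
        # Returning True suppresses the exception so the for loop can go on.
        return issubclass(exc_type, self.exceptions) and not self.is_last


def attempts(tries, exceptions=(Exception,)):
    """Yield up to `tries` attempt objects, stopping after the first success."""
    for i in range(tries):
        attempt = _Attempt(exceptions, is_last=(i == tries - 1))
        yield attempt
        if attempt.succeeded:
            return
```

The caller writes `for attempt in attempts(3, (ValueError,)):` followed by `with attempt:` and the code to retry. The `for` loop is what re-runs the block, which a bare `with` statement cannot do on its own.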
9

There are a few libraries out there that specialize in this.

One is backoff, which is designed with a particularly functional sensibility. Decorators are passed arbitrary callables returning generators which yield successive delay values. A simple exponential backoff with each delay capped at 32 seconds could be defined as:

import backoff
import urllib2

@backoff.on_exception(backoff.expo,
                      urllib2.URLError,
                      max_value=32)
def url_open(url):
    return urllib2.urlopen(url)  # use the argument rather than a hard-coded URL
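
To illustrate the wait-generator idea, the sketch below rebuilds the pattern with only the standard library. It is not backoff's actual code: `expo_waits` and `retry_on` are hypothetical stand-ins written for this example, and the `max_tries` and `sleep` parameters are assumptions made so the behaviour can be tested without real waiting.

```python
import functools
import time


def expo_waits(base=2, factor=1, max_value=None):
    """Yield factor * base**n for n = 0, 1, 2, ..., capped at max_value."""
    n = 0
    while True:
        value = factor * base ** n
        if max_value is not None and value >= max_value:
            # Once the cap is reached, keep yielding the cap.
            yield max_value
        else:
            yield value
            n += 1


def retry_on(wait_gen, exceptions, max_tries, sleep=time.sleep, **wait_kwargs):
    """Retry on `exceptions`, sleeping for the values produced by `wait_gen`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            waits = wait_gen(**wait_kwargs)  # fresh generator for every call
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_tries:
                        raise  # out of attempts, propagate the error
                    sleep(next(waits))
        return wrapper
    return decorator
```

Because the delay policy is just a generator function, swapping exponential waits for constant or randomized ones only means passing a different callable.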

Another is retrying, which has very similar functionality but an API where retry parameters are specified through predefined keyword arguments.

bgreen-litl
6

For Python 3, you can use the third-party urllib3 package and its Retry class:

from urllib3 import Retry, PoolManager


retries = Retry(connect=5, read=2, redirect=5, backoff_factor=0.1)
http = PoolManager(retries=retries)
response = http.request('GET', 'http://example.com/')

urllib3 will sleep between retries for:

        {backoff factor} * (2 ** ({number of total retries} - 1))

seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, ...] between retries. It will never be longer than Retry.BACKOFF_MAX.
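
The quoted formula can be checked with a few lines of plain Python. This is a sketch of the arithmetic only, not urllib3's code: `backoff_delays` is a hypothetical helper, the zero delay before the first retry is taken from the example sequence quoted above, and the default cap of 120 seconds is an assumption standing in for Retry.BACKOFF_MAX.

```python
def backoff_delays(backoff_factor, retries, backoff_max=120.0):
    """Return the delay in seconds before each of the first `retries` retries."""
    delays = []
    for n in range(1, retries + 1):
        if n == 1:
            # Assumption from the quoted sequence: the first retry is immediate.
            delay = 0.0
        else:
            delay = backoff_factor * (2 ** (n - 1))
        delays.append(min(delay, backoff_max))
    return delays


print(backoff_delays(0.1, 4))  # [0.0, 0.2, 0.4, 0.8]
```
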
Martin Thoma
LinPy
5

To retry on timeout, you could catch the exception as @Karl Barker suggested in the comments:

import socket
from urllib2 import URLError, urlopen

# ntries, request and timeout are supplied by the caller
assert ntries >= 1
for i in range(1, ntries + 1):
    try:
        page = urlopen(request, timeout=timeout)
        break  # success
    except URLError as err:
        if i == ntries or not isinstance(err.reason, socket.timeout):
            raise  # propagate last timeout or non-timeout errors
# use page here
jfs
  • Unfortunately, `err` is not defined in the `else:` block, so if your code reaches the else block you'll get `UnboundLocalError: local variable 'err' referenced before assignment` – elboulangero Nov 15 '22 at 00:40
  • 1
    @elboulangero: I've updated the answer to raise inside the except clause. – jfs Nov 15 '22 at 17:21