29

The urllib2 documentation says that the timeout parameter was added in Python 2.6. Unfortunately, my code base runs on Python 2.5 and 2.4 platforms.

Is there an alternative way to simulate the timeout? All I want to do is allow the code to talk to the remote server for a fixed amount of time.

Perhaps an alternative built-in library? (I don't want to install a third-party package such as pycurl.)

rubayeet

6 Answers

58

You can set a global timeout for all socket operations (including HTTP requests) by using:

socket.setdefaulttimeout()

like this:

import urllib2
import socket
socket.setdefaulttimeout(30)
f = urllib2.urlopen('http://www.python.org/')

In this case, your urllib2 request would time out after 30 seconds and raise a socket timeout exception. (socket.setdefaulttimeout() was added in Python 2.3.)
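The global default stays in effect for every socket created afterwards, so it can be worth restoring the previous value once the request is done. Here is a minimal sketch of that idea. The helper name fetch_with_default_timeout is made up for illustration, and the import fallback is only there so the same snippet also runs on Python 3. Depending on where the timeout fires, the error may surface as socket.timeout directly or wrapped in a URLError, so the sketch checks for both.

```python
import socket
import sys

try:
    import urllib2 as urlrequest          # Python 2.x
except ImportError:
    import urllib.request as urlrequest   # Python 3.x (assumption: only for portability)


def fetch_with_default_timeout(url, seconds):
    """Hypothetical helper: return the response body, or None on timeout."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)      # affects sockets created from now on
    try:
        try:
            response = urlrequest.urlopen(url)
            try:
                return response.read()
            finally:
                response.close()
        except socket.timeout:
            # Timeout raised directly by the socket layer.
            return None
        except urlrequest.URLError:
            # Timeout wrapped by urllib2; re-raise anything else.
            reason = getattr(sys.exc_info()[1], 'reason', None)
            if isinstance(reason, socket.timeout):
                return None
            raise
    finally:
        socket.setdefaulttimeout(previous)  # put the old default back
```

Note that the default timeout is process-wide, so this is not safe if other threads open sockets at the same time.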

Corey Goldberg
  • `The urllib2 module has been split across several modules in Python 3.0 named urllib.request and urllib.error.` But the rest of the code is simple enough. – MewX Dec 24 '15 at 15:35
4

With considerable irritation, you can override the httplib.HTTPConnection class that the urllib2.HTTPHandler uses.

import httplib
import socket
import urllib2

def urlopen_with_timeout(url, data=None, timeout=None):

  # Create these two helper classes fresh each time, since
  # timeout needs to be in the closure.
  class TimeoutHTTPConnection(httplib.HTTPConnection):
    def connect(self):
      """Connect to the host and port specified in __init__."""
      msg = "getaddrinfo returns an empty list"
      for res in socket.getaddrinfo(self.host, self.port, 0,
                      socket.SOCK_STREAM): 
        af, socktype, proto, canonname, sa = res
        try:
          self.sock = socket.socket(af, socktype, proto)
          if timeout is not None:
            self.sock.settimeout(timeout)
          if self.debuglevel > 0:
            print "connect: (%s, %s)" % (self.host, self.port)
          self.sock.connect(sa)
        except socket.error, msg:
          if self.debuglevel > 0:
            print 'connect fail:', (self.host, self.port)
          if self.sock:
            self.sock.close()
          self.sock = None
          continue
        break
      if not self.sock:
        raise socket.error, msg

  class TimeoutHTTPHandler(urllib2.HTTPHandler):
    http_request = urllib2.AbstractHTTPHandler.do_request_
    def http_open(self, req):
      return self.do_open(TimeoutHTTPConnection, req)

  opener = urllib2.build_opener(TimeoutHTTPHandler)
  # Return the response so the caller can actually read it.
  return opener.open(url, data)
Philip Z
2

I think your best choice is to patch (or deploy a local version of) your urllib2 with the change from the 2.6 maintenance branch.

The file should be in /usr/lib/python2.4/urllib2.py (on Linux with Python 2.4).

Kimvais
  • What about socket.settimeout()? Will it help? – rubayeet Jan 18 '10 at 09:15
  • I think it might, I had the same problem quite some time ago, and for some reason I couldn't get it to work. However, I have no recollection whatsoever where the code might be so cannot check :/ – Kimvais Jan 18 '10 at 11:15
1

You must set the timeout in two places.

import urllib2
import socket

socket.setdefaulttimeout(30)
f = urllib2.urlopen('http://www.python.org/', timeout=30)
Daniel Magnusson
  • Both work independently. However, timeout=30 works by itself. This was the best answer for me, so I am removing the -1 you had. Consider amending your answer's title to something like "You may choose to set the timeout in one or both places". Also, the main question is about the Python version. – ruralcoder Oct 09 '12 at 00:33
1

I use httplib from the standard library. It has a dead simple API, but, as you might guess, it only handles HTTP. IIUC urllib uses httplib to implement the HTTP stuff.
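For illustration, here is a rough sketch of how a per-connection timeout might be applied with httplib on versions where HTTPConnection has no timeout argument. The function name http_get_with_timeout is hypothetical, and the import fallback only exists so the snippet also runs on Python 3. The timeout is set after connect(), so it bounds sending the request and reading the response, but not the connection attempt itself.

```python
import socket

try:
    import httplib                       # Python 2.x
except ImportError:
    import http.client as httplib        # Python 3.x (assumption: only for portability)


def http_get_with_timeout(host, port, path, seconds):
    """Hypothetical helper: GET a path, returning (status, body)."""
    conn = httplib.HTTPConnection(host, port)
    try:
        conn.connect()                   # opens conn.sock (not covered by the timeout)
        conn.sock.settimeout(seconds)    # applies to the send/recv calls that follow
        conn.request('GET', path)
        response = conn.getresponse()    # raises socket.timeout if the server stalls
        return response.status, response.read()
    finally:
        conn.close()
```

Unlike socket.setdefaulttimeout(), this only touches the one connection, so other sockets in the process are unaffected.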

Kris Walker
0

Well, the way the timeout is handled in 2.4 and 2.6 is the same. If you open the urllib2.py file in 2.6 you will see that it takes an extra timeout argument and handles it through the socket timeout mechanism (socket.setdefaulttimeout()) mentioned in answer 1.

So you really need not update your urllib2.py in that case.
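If the same code has to run on both old and new interpreters, one option is a small wrapper that passes timeout through where urlopen accepts it and falls back to the global socket default elsewhere. This is only a sketch under that assumption; urlopen_compat is a made-up name, and the import fallback is there so the snippet also runs on Python 3.

```python
import socket
import sys

try:
    import urllib2 as urlrequest          # Python 2.x
except ImportError:
    import urllib.request as urlrequest   # Python 3.x (assumption: only for portability)


def urlopen_compat(url, data=None, timeout=None):
    """Hypothetical wrapper: urlopen with a timeout on any Python version."""
    if timeout is None:
        return urlrequest.urlopen(url, data)
    if sys.version_info >= (2, 6):
        # urlopen accepts a timeout argument from Python 2.6 onwards.
        return urlrequest.urlopen(url, data, timeout)
    # Older versions: temporarily change the process-wide default.
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout)
    try:
        return urlrequest.urlopen(url, data)
    finally:
        socket.setdefaulttimeout(previous)
```

On the fallback path the socket keeps the timeout it was created with, so later reads on the returned response are still bounded even after the default is restored. The fallback is not thread-safe, since the default is shared by the whole process.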

Konark Modi