321

I'm gathering statistics on a list of websites, and I'm using requests for simplicity. Here is my code:

import requests

data = []
websites = ['http://google.com', 'http://bbc.co.uk']
for w in websites:
    r = requests.get(w, verify=False)
    data.append((r.url, len(r.content), r.elapsed.total_seconds(),
                 str([(l.status_code, l.url) for l in r.history]),
                 str(r.headers.items()), str(r.cookies.items())))

Now, I want requests.get to time out after 10 seconds so the loop doesn't get stuck.

This question has been of interest before too but none of the answers are clean.

I hear that maybe not using requests is a good idea, but then how should I get the nice things requests offers (the ones in the tuple)?

Kiarash
  • possible duplicate of [How to perform time limited response download with python requests?](http://stackoverflow.com/questions/13573146/how-to-perform-time-limited-response-download-with-python-requests) – yprez Sep 20 '15 at 18:12
  • related: [Read timeout using either urllib2 or any other http library](http://stackoverflow.com/q/9548869/4279) – jfs Oct 29 '15 at 02:13

22 Answers

563

Note: The timeout param does NOT prevent the request from loading forever; it only aborts if the remote server fails to send response data within the timeout value. The response could still load indefinitely.

Set the timeout parameter:

import requests

try:
    r = requests.get("MYURL.com", timeout=10)  # 10 seconds
except requests.exceptions.Timeout:
    print("Timed out")

The code above will cause the call to requests.get() to time out if establishing the connection, or any delay between reads, takes more than ten seconds.

The timeout parameter accepts the number of seconds to wait as a float, as well as a (connect timeout, read timeout) tuple.

See the requests.request documentation, as well as the timeout section of the "Advanced Usage" part of the documentation.
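For reference, a minimal sketch of the tuple form (the URL and values here are placeholders, not from the original answer):

import requests

# 3.05 seconds to establish the connection, then up to 27 seconds
# of silence allowed between bytes read from the socket
r = requests.get("https://example.com", timeout=(3.05, 27))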

Lukasa
  • That is not for the entire response. http://requests.readthedocs.org/en/latest/user/quickstart/#timeouts – Kiarash Feb 23 '14 at 17:38
  • Yes it is, in some circumstances. One of those circumstances happens to be yours. =) I invite you to look at the code if you're not convinced. – Lukasa Feb 23 '14 at 20:57
  • what are the circumstances? – Kiarash Feb 23 '14 at 23:12
  • I just checked this and it never stopped: r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip', timeout = 20) – Kiarash Feb 23 '14 at 23:19
  • Ah, sorry, I misunderstood what you meant when you said 'the entire response'. Yes, you're right: it's not an upper limit on the total amount of time to wait. – Lukasa Feb 24 '14 at 07:03
  • I think OP wants deadline behaviour, similar to `ping` and `wget` – Dima Tisnek Feb 25 '14 at 22:33
  • I use `stream=True` and `requests.iter_content(1024)` in addition to `timeout=`. This should work even for `1GB.zip`. I posted my full answer below. – Polv Feb 28 '18 at 03:45
  • What happens when the timeout is reached? is a response value returned, is an error raised, etc. – Glen Thomas Mar 29 '22 at 13:19
  • @GlenThomas a `requests.exceptions.Timeout` is raised. See [DaWe's answer](https://stackoverflow.com/a/56941215/1719931) – robertspierre Jun 18 '23 at 07:10
165

What about using eventlet? If you want to time out the request after 10 seconds, even if data is being received, this snippet will work for you:

import requests
import eventlet
eventlet.monkey_patch()

with eventlet.Timeout(10):
    requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", verify=False)
Alvaro
  • Surely this is unnecessarily complicated. – holdenweb Feb 28 '14 at 14:00
  • Why is this unappreciated @Alvaro ? I just looked up eventlet and there is an example in the bottom of their page, very similar to what I'm trying to do ?! http://eventlet.net/ – Kiarash Feb 28 '14 at 20:11
  • I meant the last comment for @holdenweb – Kiarash Feb 28 '14 at 20:36
  • It's not unappreciated, but your solution involves importing a third-party module. It just seems simpler, given that only one socket appears to be active at any time, to set the default timeout in the `socket` module, which is anyway resident for anything that does networking. Unless there is some compelling advantage of which I am unaware, which would not surprise me unduly. – holdenweb Feb 28 '14 at 21:29
  • @holdenweb importing a module is complicated? This involves importing a third-party module, just as the requests module does. Setting the default timeout will work when a read timeout happens, but as I understood the question, the user wants to timeout the request after N seconds, regardless of a read timeout or not. – Alvaro Mar 01 '14 at 17:21
  • Thank you. I now understand your solution's technical superiority (which you stated rather succinctly at the beginning of your answer) and upvoted it. The issue with third-party modules is not importing them but ensuring they are there to be imported, hence my own preference for using the standard library where possible. – holdenweb Mar 02 '14 at 05:23
  • @holdenweb, thanks, indeed when I reviewed my original posting I realised that my explanation was not complete enough. – Alvaro Mar 02 '14 at 15:07
  • Indeed using a setdefaulttimeout is a good way to go: I have an example shown here --> [link](http://stackoverflow.com/questions/19412669/python-socket-library-thinks-socket-is-open-when-its-not/20193533#20193533) as @holdenweb mentioned, it's an option I use frequently. – Div Mar 03 '14 at 23:37
  • This is a good answer, but it throws in an unnecessary dependency. You can get the exact same syntax using signals, see my answer. – totokaka Mar 04 '14 at 07:17
  • very nice hack! +1! I guess name resolution is not covered by timeout as well as any processing in a C library, e.g. ssl processing for https requests, but the elegance beats any concerns I have :P – Dima Tisnek Mar 11 '14 at 08:48
  • Is `eventlet.monkey_patch()` required? – User Jun 12 '15 at 16:06
  • Yes, the `socket` module needs to be monkey patched, so at least you'll need a `eventlet.monkey_patch(socket=True)` – Alvaro Jun 16 '15 at 14:11
  • As of ***2018*** this answer is outdated. Use **`requests.get('https://github.com', timeout=5)`** – Pedro Lobito Apr 06 '18 at 14:15
  • [This comment](https://github.com/requests/requests/issues/3099#issuecomment-215498005) from a requests developer has a good explanation of why requests does not have a total-response-time timeout, and what they suggest instead. – Christian Long Sep 14 '18 at 14:14
  • Not ideal at all. See answer below. Also didn't work at all for me. – user2924019 Oct 18 '18 at 20:56
  • I get a RecursionError: maximum recursion depth exceeded when I use this package. It is not working for me – toshiro92 Mar 21 '19 at 09:30
  • For some reason, this is the only solution that worked for me. – yshahak Jul 03 '19 at 18:39
  • The combination of Python 3.7 + requests + eventlet.Timeout seems not to work anymore (either `RecursionError` or `TypeError: wrap_socket() got an unexpected keyword argument '_context'` depending on whether I use `socket=True` or `socket=False` for the eventlet monkey patch). It was fine with earlier versions of Python... Good news however, I was successful in implementing the signal based solution from user totokaka below (https://stackoverflow.com/a/22156618/8046487) – Mathieu Rey Nov 08 '19 at 11:39
  • @PedroLobito's comment is wrong - last time I read requests' documentation, the `timeout` parameter applied to every read, which means that it *might* timeout the request after that many seconds, or in the pathological case it could still basically sit there *indefinitely* (if every <5 seconds the server sends a byte). – mtraceur Jan 21 '20 at 18:56
  • @PedroLobito **more importantly**, the `timeout` argument that requests provide does not (as of when I last looked into it) apply to things like DNS lookups - so if the DNS lookup to "example.com" hangs because of some problem (which I have seen happen in production on systems I have worked at) then it will still hang indefinitely regardless of the timeout parameter. I don't know for sure if the provided answer actually solves this problem, but since eventlet invasively monkeypatches, it seems very likely that it would. – mtraceur Jan 21 '20 at 19:01
  • Use `eventlet.monkey_patch(thread=False)` to make Processes work well if someone is using `multiprocessing` module at the same time. – Hansimov Apr 27 '20 at 07:15
  • Only `eventlet.monkey_patch(thread=True)` ran in Python 3.8.6 (tags/v3.8.6:db45529, Sep 23 2020, 15:52:53) [MSC v.1927 64 bit (AMD64)] on win32, but conflicted with VSCode's debugger and didn't work. – Cees Timmerman Feb 03 '21 at 22:28
146

UPDATE: https://requests.readthedocs.io/en/master/user/advanced/#timeouts

In newer versions of requests:

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)

My old (probably outdated) answer, which was posted a long time ago:

There are other ways to overcome this problem:

1. Use the TimeoutSauce internal class

From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896

import requests
from requests.adapters import TimeoutSauce

class MyTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        connect = kwargs.get('connect', 5)
        read = kwargs.get('read', connect)
        super(MyTimeout, self).__init__(connect=connect, read=read)

requests.adapters.TimeoutSauce = MyTimeout

This code should cause us to set the read timeout as equal to the connect timeout, which is the timeout value you pass on your Session.get() call. (Note that I haven't actually tested this code, so it may need some quick debugging, I just wrote it straight into the GitHub window.)

2. Use a fork of requests from kevinburke: https://github.com/kevinburke/requests/tree/connect-timeout

From its documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

kevinburke has requested it to be merged into the main requests project, but it hasn't been accepted yet.

Hieu
  • option 1 doesn't work. if you continue reading that thread, other people have said "this won't work for your use-case, I'm afraid. The read timeout function is at the scope of an individual socket recv() call, so that if the server stops sending data for more than the read timeout we'll abort." – Kiarash Mar 13 '14 at 15:28
  • There is another nice solution in that thread using Signal, which wouldn't work for me either, because I use Windows and signal.alarm is linux only. – Kiarash Mar 13 '14 at 15:28
  • @Kiarash I haven't tested it yet. However, as I understand when Lukasa said `this won't work for you use-case`. He meant it doesn't work with mp3 stream which is wanted by the other guy. – Hieu Mar 14 '14 at 14:07
  • @Hieu - this was merged in another pull request - https://github.com/kennethreitz/requests/pull/2176#discussion-diff-16632478 – yprez Sep 20 '15 at 18:08
  • timeout=None is not blocking the call. – crazydan Sep 07 '19 at 17:39
  • The 1st link is 404 – Pedro Lobito Apr 18 '20 at 08:39
  • (1) The link is dead. Now https://requests.readthedocs.io/en/master/user/advanced/#timeouts (2) JFYI, `timeout=None` is same as not specifying `timeout`. Source: The website says "*By default, requests do not time out unless a timeout value is set explicitly.*" – ynn Jun 22 '20 at 14:33
  • That is not for the entire response. https://requests.readthedocs.org/en/latest/user/quickstart/#timeouts – Klaas van Schelven Aug 26 '22 at 08:00
76

Since requests >= 2.4.0, you can use the timeout argument, e.g.:

requests.get('https://duckduckgo.com/', timeout=10)

You can also provide a tuple to specify connect and the read timeouts separately:

requests.get('https://duckduckgo.com/', timeout=(5, 8.5))

a None timeout will wait forever (not recommended)


Note:

timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.
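A minimal sketch of handling the two timeout types separately (the URL is a placeholder; ConnectTimeout and ReadTimeout are both subclasses of requests.exceptions.Timeout):

import requests

try:
    r = requests.get('https://duckduckgo.com/', timeout=(5, 8.5))
except requests.exceptions.ConnectTimeout:
    print('the connection attempt timed out')
except requests.exceptions.ReadTimeout:
    print('the server stopped sending data')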

Pedro Lobito
  • What version of requests has the new timeout parameter? – Rusty Jan 15 '19 at 21:08
  • Seems to be since version 2.4.0 : *Support for connect timeouts! Timeout now accepts a tuple (connect, read) which is used to set individual connect and read timeouts*. https://pypi.org/project/requests/2.4.0/ – Pedro Lobito Jan 17 '19 at 13:45
27

To create a timeout you can use signals.

The best way to solve this case is probably to

  1. Set a handler that raises an exception for the alarm signal
  2. Schedule the alarm signal with a ten-second delay
  3. Call the function inside a try-except-finally block.
  4. The except block is reached if the function timed out.
  5. In the finally block you abort the alarm, so it isn't signaled later.

Here is some example code:

import signal
from time import sleep

class TimeoutException(Exception):
    """ Simple Exception to be called on timeouts. """
    pass

def _timeout(signum, frame):
    """ Raise an TimeoutException.

    This is intended for use as a signal handler.
    The signum and frame arguments passed to this are ignored.

    """
    # Raise TimeoutException with system default timeout message
    raise TimeoutException()

# Set the handler for the SIGALRM signal:
signal.signal(signal.SIGALRM, _timeout)
# Send the SIGALRM signal in 10 seconds:
signal.alarm(10)

try:    
    # Do our code:
    print('This will take 11 seconds...')
    sleep(11)
    print('done!')
except TimeoutException:
    print('It timed out!')
finally:
    # Abort the sending of the SIGALRM signal:
    signal.alarm(0)

There are some caveats to this:

  1. It is not threadsafe; signals are always delivered to the main thread, so you can't put this in any other thread.
  2. There is a slight delay between the scheduling of the signal and the execution of the actual code. This means that the example would time out even if it only slept for ten seconds.

But it's all in the standard Python library! Apart from the sleep-function import, it's only one import. If you are going to use timeouts in many places, you can easily put the TimeoutException, _timeout and the signaling in a function and just call that, or you can make a decorator and put it on functions, as sketched below.
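For instance, a minimal sketch of such a decorator, reusing the TimeoutException and _timeout handler from above (Unix only, since SIGALRM is unavailable on Windows):

import signal
from functools import wraps

def with_timeout(seconds=10):
    """ Raise TimeoutException if the decorated call exceeds `seconds`. """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            old_handler = signal.signal(signal.SIGALRM, _timeout)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)  # abort the pending alarm
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator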

You can also set this up as a "context manager" so you can use it with the with statement:

import signal
class Timeout():
    """ Timeout for use with the `with` statement. """

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise an TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.

        """
        raise Timeout.TimeoutException()

    def __init__(self, timeout=10):
        self.timeout = timeout
        signal.signal(signal.SIGALRM, Timeout._timeout)

    def __enter__(self):
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        return exc_type is Timeout.TimeoutException

# Demonstration:
from time import sleep

print('This is going to take maximum 10 seconds...')
with Timeout(10):
    sleep(15)
    print('No timeout?')
print('Done')

One possible downside of this context-manager approach is that you can't know whether the code actually timed out or not; one way around that is sketched below.
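A small sketch of that workaround: record the outcome on the instance before suppressing the exception (the timed_out attribute is my addition, not part of the class above):

    def __enter__(self):
        self.timed_out = False
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        # Remember whether the block was interrupted by the timeout
        self.timed_out = exc_type is Timeout.TimeoutException
        return self.timed_out

# Usage:
t = Timeout(10)
with t:
    sleep(15)
print('Timed out?', t.timed_out)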


totokaka
  • Signals are only delivered in the main thread, thus it *definitely* won't work in other threads, not *probably*. – Dima Tisnek Mar 11 '14 at 08:49
  • The [timeout-decorator](https://github.com/pnpnpn/timeout-decorator) package provides a timeout decorator that uses signals (or optionally multiprocessing). – Christian Long Sep 13 '18 at 19:33
25

In 2023, most other answers are incorrect. You will not achieve what you want.

TL;DR - the proper solution, condensed

import requests, sys, time

TOTAL_TIMEOUT = 10

def trace_function(frame, event, arg):
    if time.time() - start > TOTAL_TIMEOUT:
        raise Exception('Timed out!')

    return trace_function

start = time.time()
sys.settrace(trace_function)

try:
    res = requests.get('http://localhost:8080', timeout=(3, 6))
except:
    raise
finally:
    sys.settrace(None)

Read the explanation to understand why!

Despite all the answers, I believe that this thread still lacks a proper solution and no existing answer presents a reasonable way to do something which should be simple and obvious.

Let's start by saying that as of 2023, there is still absolutely no way to do it properly with requests alone. It is a conscious design decision by the library's developers.

Solutions utilizing the timeout parameter simply do not accomplish what they intend to do. The fact that they "seem" to work at first glance is purely incidental:

The timeout parameter has absolutely nothing to do with the total execution time of the request. It merely controls the maximum amount of time that can pass before the underlying socket receives any data. With an example timeout of 5 seconds, the server can just as well send 1 byte of data every 4 seconds and that will be perfectly okay, even though it won't help you very much.

Answers with stream and iter_content are somewhat better, but they still do not cover everything in a request. You do not actually receive anything from iter_content until after the response headers are sent, which falls under the same issue: even if you use 1 byte as a chunk size for iter_content, reading the full response headers could take a totally arbitrary amount of time and you can never actually get to the point where you read any response body from iter_content.

Here are some examples that completely break both the timeout- and stream-based approaches. Try them all. They all hang indefinitely, no matter which method you use.

server.py

import socket
import time

server = socket.socket()

server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
server.bind(('127.0.0.1', 8080))

server.listen()

while True:
    try:
        sock, addr = server.accept()
        print('Connection from', addr)
        sock.send(b'HTTP/1.1 200 OK\r\n')

        # Send some garbage headers very slowly but steadily.
        # Never actually complete the response.

        while True:
            sock.send(b'a')
            time.sleep(1)
    except:
        pass

demo1.py

import requests

requests.get('http://localhost:8080')

demo2.py

import requests

requests.get('http://localhost:8080', timeout=5)

demo3.py

import requests

requests.get('http://localhost:8080', timeout=(5, 5))

demo4.py

import requests

with requests.get('http://localhost:8080', timeout=(5, 5), stream=True) as res:
    for chunk in res.iter_content(1):
        break

The proper solution

My approach utilizes Python's sys.settrace function. It is dead simple. You do not need to use any external libraries or turn your code upside down. Unlike most other answers, this actually guarantees that the code executes in the specified time. Be aware that you still need to specify the timeout parameter, as settrace only concerns Python code. Actual socket reads are external syscalls which are not covered by settrace, but are covered by the timeout parameter. Due to this fact, the exact time limit is not TOTAL_TIMEOUT, but a value which is explained in the comments below.

import requests
import sys
import time

# This function serves as a "hook" that executes for each Python statement
# down the road. There may be some performance penalty, but as downloading
# a webpage is mostly I/O bound, it's not going to be significant.

def trace_function(frame, event, arg):
    if time.time() - start > TOTAL_TIMEOUT:
        raise Exception('Timed out!') # Use whatever exception you consider appropriate.

    return trace_function

# The following code will terminate at most after TOTAL_TIMEOUT + the highest
# value specified in `timeout` parameter of `requests.get`.
# In this case 10 + 6 = 16 seconds.
# For most cases though, it's gonna terminate no later than TOTAL_TIMEOUT.

TOTAL_TIMEOUT = 10

start = time.time()

sys.settrace(trace_function)

try:
    res = requests.get('http://localhost:8080', timeout=(3, 6)) # Use whatever timeout values you consider appropriate.
except:
    raise
finally:
    sys.settrace(None) # Remove the time constraint and continue normally.

# Do something with the response

That's it!

Tim Colorado
  • Thanks, this worked great (Python 3.10). I didn't even know you could manipulate stack frames in a custom source code debugger using [`sys.settrace()`](https://docs.python.org/3/library/sys.html#sys.settrace). – Splines Jul 22 '22 at 12:56
22

Try this request with timeout & error handling:

import requests

try:
    url = "http://google.com"
    r = requests.get(url, timeout=10)
except requests.exceptions.Timeout as e:
    print(e)
DaWe
  • If the site is unreachable, it throws `requests.exceptions.ConnectionError` instead of `Timeout`, so you might want to catch a more generic `RequestException` instead. – egor83 Aug 24 '23 at 07:53
14

The connect timeout is the number of seconds Requests will wait for your client to establish a connection to a remote machine (corresponding to the connect() call on the socket). It's a good practice to set connect timeouts to slightly larger than a multiple of 3, which is the default TCP packet retransmission window.

Once your client has connected to the server and sent the HTTP request, the read timeout starts. It is the number of seconds the client will wait for the server to send a response. (Specifically, it's the number of seconds that the client will wait between bytes sent from the server. In 99.9% of cases, this is the time before the server sends the first byte.)

If you specify a single value for the timeout, it will be applied to both the connect and the read timeouts, like below:

r = requests.get('https://github.com', timeout=5)

Specify a tuple if you would like to set the values separately for connect and read:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)

https://docs.python-requests.org/en/latest/user/advanced/#timeouts

Alexander C
6

Set stream=True and use r.iter_content(1024). Yes, eventlet.Timeout just somehow doesn't work for me.

from time import time

import requests
from requests import get, exceptions

# `config` holds the source URL and a local fallback path in the author's setup
try:
    start = time()
    timeout = 5
    with get(config['source']['online'], stream=True, timeout=timeout) as r:
        r.raise_for_status()
        content = bytes()
        content_gen = r.iter_content(1024)
        while True:
            if time() - start > timeout:
                raise TimeoutError('Time out! ({} seconds)'.format(timeout))
            try:
                content += next(content_gen)
            except StopIteration:
                break
        data = content.decode().split('\n')
        if len(data) in [0, 1]:
            raise ValueError('Bad requests data')
except (exceptions.RequestException, ValueError, IndexError, KeyboardInterrupt,
        TimeoutError) as e:
    print(e)
    with open(config['source']['local']) as f:
        data = [line.strip() for line in f.readlines()]

The discussion is here https://redd.it/80kp1h

Polv
  • it's a shame requests doesn't support a max-time param; this solution is the only one that worked with asyncio – wukong Aug 21 '19 at 17:23
5

This may be overkill, but the Celery distributed task queue has good support for timeouts.

In particular, you can define a soft time limit that just raises an exception in your process (so you can clean up) and/or a hard time limit that terminates the task when the time limit has been exceeded.

Under the covers, this uses the same signals approach as referenced in your "before" post, but in a more usable and manageable way. And if the list of web sites you are monitoring is long, you might benefit from its primary feature -- all kinds of ways to manage the execution of a large number of tasks.
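A minimal sketch of what that could look like (the broker URL and task body are illustrative, not from the original answer; the limits need a worker that supports them):

import requests
from celery import Celery
from celery.exceptions import SoftTimeLimitExceeded

app = Celery('stats', broker='redis://localhost:6379/0')  # hypothetical broker

@app.task(soft_time_limit=10, time_limit=15)
def fetch(url):
    try:
        r = requests.get(url)
        return (r.url, len(r.content), r.elapsed.total_seconds())
    except SoftTimeLimitExceeded:
        return None  # soft limit hit: clean up and give up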

Chris Johnson
  • This could be a good solution. The problem of total timeout is not related directly to `python-requests` but to `httplib` (used by requests for Python 2.7). The package passes everything related to `timeout` directly to httplib. I think that nothing can be fixed in requests because the process can stay for a long time in httplib. – hynekcer Feb 27 '14 at 14:18
  • @hynekcer, I think you are right. This is why detecting timeouts out-of-process and enforcing by cleanly killing processes, as Celery does, can be a good approach. – Chris Johnson Feb 27 '14 at 17:19
5

I believe you can use multiprocessing and not depend on a 3rd party package:

import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # define a wrapper of `return_dict` to store the result.
    def function(return_dict):
        return_dict['value'] = func(*args, **kwargs)

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()

    # Force a max. `timeout` or wait for the process to finish
    p.join(timeout)

    # If the process is still alive, it didn't finish: raise TimeoutError
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    else:
        return return_dict['value']

call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)

The timeout passed in kwargs is the timeout to get any response from the server; the timeout argument is the timeout to get the complete response.
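As a commenter below suggests, this becomes more robust if the child catches errors and the parent re-raises them. A sketch of the revised helper (note the exception must be picklable to cross the process boundary):

import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    def function(return_dict):
        try:
            return_dict['value'] = func(*args, **kwargs)
        except Exception as e:
            return_dict['error'] = e  # ferry the child's exception back

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()
    p.join(timeout)

    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    if 'error' in return_dict:
        raise return_dict['error']  # re-raise in the parent
    return return_dict['value']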

Jorge Leitao
  • This can be improved with a generic try/except in the private function that catches all errors and puts them in return_dict['error']. Then at the end, before returning, check if 'error' in return_dict and then raise it. It makes it much easier to test as well. – dialt0ne May 12 '16 at 02:30
2

Despite the question being about requests, I find this very easy to do with pycurl CURLOPT_TIMEOUT or CURLOPT_TIMEOUT_MS.

No threading or signaling required:

import io
import traceback

import pycurl

url = 'http://www.example.com/example.zip'
timeout_ms = 1000
raw = io.BytesIO()  # StringIO.StringIO() on Python 2
c = pycurl.Curl()
c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
c.setopt(pycurl.WRITEFUNCTION, raw.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, url)
c.setopt(pycurl.HTTPGET, 1)
try:
    c.perform()
except pycurl.error:
    traceback.print_exc()  # error generated on timeout; or just pass if you don't want to print it
John Smith
2

In case you're using the option stream=True you can do this:

import time

import requests

r = requests.get(
    'http://url_to_large_file',
    timeout=1,  # relevant only for the underlying socket
    stream=True)

with open('/tmp/out_file.txt', 'wb') as f:
    start_time = time.time()
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
        if time.time() - start_time > 8:
            raise Exception('Request took longer than 8s')

The solution does not need signals or multiprocessing.

ub_marco
  • this won't work if the target server stops streaming data; you will be locked forever at the ```iter``` line. Such situations happen when your auth session expires, for example. – ulkas Nov 09 '21 at 11:29
2

Just another solution (got it from http://docs.python-requests.org/en/master/user/advanced/#streaming-uploads)

Before downloading the body, you can find out the content size:

import requests

TOO_LONG = 10*1024*1024  # 10 MB
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"
r = requests.get(big_url, stream=True)
print(r.headers['content-length'])
# 1073741824

if int(r.headers['content-length']) < TOO_LONG:
    # download the content:
    content = r.content

But be careful: a sender can set an incorrect value in the 'content-length' response field.
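A sketch of a guard against that: count the bytes yourself while streaming, and stop as soon as the limit is exceeded, regardless of the advertised length:

import requests

TOO_LONG = 10*1024*1024  # 10 MB
r = requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", stream=True)

content = b''
for chunk in r.iter_content(chunk_size=8192):
    content += chunk
    if len(content) > TOO_LONG:
        r.close()  # stop reading; don't trust Content-Length
        raise ValueError('Response exceeded the size limit')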

Denis Kuzin
2

timeout = (connect timeout, read timeout); or give a single value (timeout=1):

import requests

try:
    req = requests.request('GET', 'https://www.google.com', timeout=(1, 1))
    print(req)
except requests.ReadTimeout:
    print("READ TIME OUT")
1

This code works for socket errors 11004 and 10060:

# -*- encoding:UTF-8 -*-
__author__ = 'ACE'
import requests
from PyQt4.QtCore import *
from PyQt4.QtGui import *


class TimeOutModel(QThread):
    Existed = pyqtSignal(bool)
    TimeOut = pyqtSignal()

    def __init__(self, fun, timeout=500, parent=None):
        """
        @param fun: function or lambda
        @param timeout: ms
        """
        super(TimeOutModel, self).__init__(parent)
        self.fun = fun

        self.timeer = QTimer(self)
        self.timeer.setInterval(timeout)
        self.timeer.timeout.connect(self.time_timeout)
        self.Existed.connect(self.timeer.stop)
        self.timeer.start()

        self.setTerminationEnabled(True)

    def time_timeout(self):
        self.timeer.stop()
        self.TimeOut.emit()
        self.quit()
        self.terminate()

    def run(self):
        self.fun()


bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")

a = QApplication([])

z = TimeOutModel(bb, 500)
z.start()  # note: the thread must be started for run() to execute
print('timeout')

a.exec_()
ACEE
1

Well, I tried many solutions on this page and still faced instabilities, random hangs, and poor connection performance.

I'm now using curl and I'm really happy with its "max time" functionality and the overall performance, even with an implementation as crude as this:

import subprocess  # `commands.getoutput` was the Python 2 equivalent

content = subprocess.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')

Here, I defined a 6-second max-time parameter, covering both connection and transfer time.

I'm sure Curl has a nice python binding, if you prefer to stick to the pythonic syntax :)
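If you'd rather avoid shell parsing, a sketch with subprocess.run (the argument list and limits are illustrative):

import subprocess

result = subprocess.run(
    ['curl', '-m', '6', '-sS', 'http://mywebsite.xyz'],
    capture_output=True, text=True,
    timeout=10)  # a belt-and-braces deadline on the whole subprocess
content = result.stdout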

technico
1

There is a package called timeout-decorator that you can use to time out any python function.

import time

import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1, 10):
        time.sleep(1)
        print("{} seconds have passed".format(i))

It uses the signals approach that some answers here suggest. Alternatively, you can tell it to use multiprocessing instead of signals (e.g. if you are in a multi-thread environment).
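A sketch of the multiprocessing variant applied to a request (use_signals=False is the package's documented switch; with it, the return value must be picklable):

import requests
import timeout_decorator

@timeout_decorator.timeout(10, use_signals=False)  # safe off the main thread
def fetch(url):
    return requests.get(url)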

Christian Long
1

The biggest problem is that if the connection can't be established, the requests package waits too long and blocks the rest of the program.

There are several ways to tackle the problem, but when I looked for a one-liner similar to requests, I couldn't find anything. That's why I built a wrapper around requests called reqto ("requests timeout"), which supports proper timeouts for all standard methods from requests.

pip install reqto

The syntax is identical to requests:

import reqto

response = reqto.get('https://pypi.org/pypi/reqto/json', timeout=1)
# Will raise an exception on Timeout
print(response)

Moreover, you can set up a custom timeout function

def custom_function(parameter):
    print(parameter)


response = reqto.get('https://pypi.org/pypi/reqto/json', timeout=5,
                     timeout_function=custom_function,
                     timeout_args="Timeout custom function called")
#Will call timeout_function instead of raising an exception on Timeout
print(response)

One important note: the import line

import reqto

needs to come earlier than all other imports working with requests, threading, etc., due to the monkey patching which runs in the background.

DovaX
0

If it comes to that, create a watchdog thread that messes up requests' internal state after 10 seconds, e.g.:

  • closes the underlying socket, and ideally
  • triggers an exception if requests retries the operation

Note that depending on the system libraries, you may be unable to set a deadline on DNS resolution.
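A minimal sketch of that idea using only public API: request with stream=True so the body read can still be interrupted, and let a timer thread close the response (the exact exception raised mid-read depends on urllib3 internals):

import threading
import requests

def get_with_deadline(url, deadline=10):
    # `timeout` bounds connect time and each read; the watchdog bounds the total
    resp = requests.get(url, stream=True, timeout=deadline)
    watchdog = threading.Timer(deadline, resp.close)
    watchdog.start()
    try:
        return resp.content  # aborts once the watchdog closes the connection
    finally:
        watchdog.cancel()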

Dima Tisnek
0

I'm using requests 2.2.1 and eventlet didn't work for me. Instead, I was able to use the gevent timeout, since gevent is used in my service for gunicorn.

import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)

import requests

try:
    with gevent.Timeout(5):
        ret = requests.get(url)
        print(ret.status_code, ret.content)
except gevent.timeout.Timeout as e:
    print("timeout: {}".format(e))

Please note that gevent.timeout.Timeout is not caught by general Exception handling. So either explicitly catch gevent.timeout.Timeout, or pass in a different exception to be used, like so: with gevent.Timeout(5, requests.exceptions.Timeout). Note that no message is passed when this exception is raised; a sketch of that variant follows.
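A sketch of that variant, substituting requests' own exception so that one handler catches it (the URL is a placeholder):

import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)

import requests

try:
    with gevent.Timeout(5, requests.exceptions.Timeout):
        ret = requests.get("http://example.com")
except requests.exceptions.Timeout:
    print("timed out")  # note: no message is attached to this exception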

greggmi
-1

I came up with a more direct solution that is admittedly ugly but fixes the real problem. It goes a bit like this:

import requests

resp = requests.get(some_url, stream=True)
resp.raw._fp.fp._sock.settimeout(read_timeout)
# This will load the entire response even though stream is set
content = resp.content

You can read the full explanation here

Realistic
  • 1- because [you can pass `timeout` parameter to `requests.get()`](http://stackoverflow.com/a/21966169/4279) without ugly workarounds 2- though [both won't limit the total timeout](http://stackoverflow.com/a/32684677/4279) unlike [`eventlet.Timeout(10)`](http://stackoverflow.com/a/22096841/4279) – jfs Feb 08 '16 at 10:29