8

I'm using requests to get a URL, such as:

import socket

import requests

while True:
    try:
        rv = requests.get(url, timeout=1)
        doSth(rv)
    except socket.timeout as e:
        print(e)
    except Exception as e:
        print(e)

After it runs for a while, it stops working. No exception or error is raised; the process just appears to hang. When I stop it by pressing Ctrl+C in the console, the traceback shows that the process is waiting for data:

  .............
   httplib_response = conn.getresponse(buffering=True) #httplib.py
   response.begin() #httplib.py
   version, status, reason = self._read_status() #httplib.py
   line = self.fp.readline(_MAXLINE + 1) #httplib.py
   data = self._sock.recv(self._rbufsize) #socket.py
KeyboardInterrupt

Why is this happening? Is there a solution?

martineau
bin381
  • It's waiting on the server to send data; there is nothing you can do on the client side. Perhaps the server is purposely throttling you? – Danielle M. Aug 30 '16 at 12:43
  • I think so. So I have to lower the frequency of my requests, right? Any other suggestions? – bin381 Aug 30 '16 at 13:01
  • Afaik, that's your only option. Just make sure to keep latency in mind when setting your `timeout`: `10ms`, for example, likely won't work. – deepbrook Aug 30 '16 at 13:52
  • Has this ever been answered satisfactorily? I am having the same problem: the URL works fine in a browser (it returns a JSON list), but when I `get` the same URL with requests, it just hangs... –  Feb 26 '17 at 14:56
  • @Surest-Texas when I had this problem, I tried sending browser-like headers using UserAgent (https://stackoverflow.com/questions/27652543/how-to-use-python-requests-to-fake-a-browser-visit) and it still didn't work. Eventually I realised the browser was using https and all I had to do was set https instead of http in my `requests.get()` call... doh! – user2428107 Jul 30 '17 at 00:08
  • Thanks @user2428107, but I have no idea at this point; it's been too long! What you say sounds reasonable, though. –  Jul 30 '17 at 02:50

2 Answers

10

It appears that the server you're sending your request to is throttling you. That is, it sends bytes less than 1 second apart (so your timeout parameter, which only limits the gap between received bytes, never fires), but slowly enough that the request appears to be stuck.

The only fix I can think of is to reduce the timeout parameter, unless you can resolve the throttling with the server's provider.

Do keep in mind that you'll need to consider latency when setting the timeout parameter, otherwise your connection will be dropped too quickly and might not work at all.
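Since the timeout only limits the gap between bytes, one way to bound the total time is to stream the body and check a wall-clock deadline between chunks. This is a minimal Python 3 sketch, not something from the requests documentation: `read_with_deadline`, `fetch_with_deadline`, and the default values are hypothetical names and numbers chosen for illustration. Note that the check only runs when a chunk arrives, and it does not cover the time spent waiting for the status line and headers.

```python
import time


def read_with_deadline(chunks, deadline_seconds, clock=time.monotonic):
    """Join byte chunks, raising TimeoutError once the total time exceeds the deadline."""
    start = clock()
    parts = []
    for chunk in chunks:
        # The per-read timeout cannot catch a slow drip, so check total elapsed time here.
        if clock() - start > deadline_seconds:
            raise TimeoutError("response exceeded the overall deadline")
        parts.append(chunk)
    return b"".join(parts)


def fetch_with_deadline(url, deadline_seconds=5.0, per_read_timeout=1.0):
    """Hypothetical helper: download a URL's body within an overall deadline."""
    # Imported here so read_with_deadline can be exercised without requests installed.
    import requests

    # stream=True defers the body download so it can be consumed chunk by chunk.
    with requests.get(url, stream=True, timeout=per_read_timeout) as response:
        response.raise_for_status()
        # A smaller chunk_size means the deadline is checked more often.
        return read_with_deadline(response.iter_content(chunk_size=1024), deadline_seconds)
```

In the loop from the question, `fetch_with_deadline(url)` would replace the bare `requests.get(url, timeout=1)` call, with `TimeoutError` handled alongside the other exceptions.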

deepbrook
2

By default, requests does not set a timeout for connecting or reading. If, for some reason, the server does not respond in time, the client will hang while connecting or reading, most often while reading the response.

The quick resolution is to set a timeout value on the request; the approach is well described here: http://docs.python-requests.org/en/master/user/advanced/#timeouts (Thanks to the guys.)

If this resolves the issue, please mark this as the resolution. Thanks.
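As a rough illustration of that page, the `timeout` argument can also be a `(connect, read)` tuple, and the two failure modes can be caught separately. This is only a sketch: the `fetch` helper and the default values are hypothetical, not taken from the question.

```python
import requests


def fetch(url, connect_timeout=3.05, read_timeout=10.0):
    """Return the response, or None if connecting or reading times out."""
    try:
        # First value bounds the connection attempt, second the wait between received bytes.
        return requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        print("connect timed out")
    except requests.exceptions.ReadTimeout:
        print("read timed out")
    return None
```

Both exceptions are subclasses of `requests.exceptions.Timeout`, so a single `except requests.exceptions.Timeout` clause also works when the distinction does not matter.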

Chao Zhang