I have an API manager that connects to a URL and grabs some JSON. Very simple. Cut from the method:
import socket
from urllib2 import Request, urlopen

req = Request(url)
socket.setdefaulttimeout(timeout)   # global fallback timeout for sockets
resp = urlopen(req, None, timeout)  # per-call timeout
data = resp.read()
resp.close()
It works fine most of the time, but at random intervals the request takes 5 s to complete, even when timeout is set to 0.5 or 1.0 or anything else. I have logged it very closely, so I am 100% sure that the line that takes the time is the urlopen call (i.e. resp = urlopen(req, None, timeout)).
I've tried all the solutions I've found on the topic of timeout decorators, Timers, etc. To list some of them: Python urllib2.urlopen freezes script infinitely even though timeout is set, How can I force urllib2 to time out?, Timing out urllib2 urlopen operation in Python 2.4, Timeout function if it takes too long to finish.
But nothing works. My impression is that the thread freezes while urlopen does something; when it's done it unfreezes, and only then do all the timers and timeouts return with timeout errors. But the total execution time is still more than 5 s.
I've found an old mailing-list thread regarding urllib2 and its handling of chunked encoding. If that problem is still present, the solution might be to write a custom urlopen based on httplib.HTTP instead of httplib.HTTPConnection. Another possible solution is to try some multithreading magic...
Both solutions seem too aggressive. And it bugs me that the timeout does not work all the way.
It is very important that the execution time of the script does not exceed 0.5 s. Does anyone know why I am experiencing the freezes, or have a way to help me?
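For reference, the "multithreading magic" route mentioned above can be taken further: run urlopen in a separate process, so that even if it blocks past its timeout, only the worker hangs and the parent can kill it after a hard deadline. A minimal sketch under my own assumptions (the function names are mine, and on Python 2 the import would come from urllib2 instead; on Windows/macOS the calls belong under an `if __name__ == "__main__":` guard):

```python
import multiprocessing

try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2, as in the question

def _fetch(url, timeout, queue):
    # Child process: even if urlopen blocks past its timeout,
    # only this process hangs; the parent can terminate it.
    try:
        resp = urlopen(url, None, timeout)
        queue.put(resp.read())
        resp.close()
    except Exception:
        queue.put(None)

def fetch_with_hard_timeout(url, timeout, grace=0.5):
    """Return the response body, or None if nothing arrived in time."""
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_fetch, args=(url, timeout, queue))
    proc.start()
    proc.join(timeout + grace)   # hard deadline, independent of urlopen's state
    if proc.is_alive():
        proc.terminate()         # kill the stuck worker outright
        proc.join()
        return None
    try:
        return queue.get(timeout=1.0)
    except Exception:            # nothing was queued before the worker died
        return None
```

Spawning a process per request is heavyweight, which matches the "too aggressive" objection, but it is the only approach here that bounds the wall-clock time no matter where urlopen blocks.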
Update based on the accepted answer: I changed the approach and use curl instead. Together with the Unix timeout command it works just as I want. Example code follows:
from subprocess import Popen, PIPE

t_timeout = str(API_TIMEOUT_TIME)   # deadline for the whole command
c_timeout = str(CURL_TIMEOUT_TIME)  # curl's own --max-time limit
cmd = ['timeout', t_timeout, 'curl', '--max-time', c_timeout, url]
prc = Popen(cmd, stdout=PIPE, stderr=PIPE)
stdout, stderr = prc.communicate()  # communicate() returns a (stdout, stderr) tuple
Since curl only accepts an integer for --max-time, I added the timeout command on top; timeout accepts floats.
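The update's approach can be wrapped in a small helper. A sketch under the assumption that GNU coreutils timeout is available (it accepts fractional deadlines like 0.5 and exits with status 124 when the deadline fires, which lets the caller tell a timeout apart from a curl failure); the function name is mine:

```python
from subprocess import Popen, PIPE

def run_with_timeout(cmd, deadline):
    """Run `cmd` under GNU `timeout`; return (stdout, stderr, returncode).

    GNU timeout exits with status 124 when `deadline` (which may be a
    float, e.g. 0.5) elapses before `cmd` finishes on its own.
    """
    prc = Popen(['timeout', str(deadline)] + list(cmd), stdout=PIPE, stderr=PIPE)
    out, err = prc.communicate()
    return out, err, prc.returncode
```

Usage for the case above would be something like `body, err, rc = run_with_timeout(['curl', '--max-time', '1', url], 0.5)`, with `rc == 124` meaning the hard deadline was hit.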