I'm writing a Python spider that crawls the web through a urllib2 OpenerDirector. The problem is that, sooner or later, a connection hangs on an https address, apparently ignoring the timeout value.

One solution would be to run the request in a thread and then kill and restart the thread if it hangs. Apparently Python doesn't support killing threads, and doing so is considered a Bad Idea because of garbage collection and other issues. I would still prefer this solution, however, because of its simplicity.

Another idea would be to use an asynchronous library like Twisted, but that doesn't solve the problem.

I need either a way to forcibly interrupt the call or a fix for the way the urllib2 OpenerDirector handles timeouts. Thanks.

2 Answers

A similar question has already been asked on StackOverflow. When I faced something similar, I found it easier to restructure my code into functions that return a value when a timeout occurs. This also opens up more possibilities, since different return values can signal different outcomes.

Another answer to the related question that I linked to above sounds more like what you're looking for (as I understand it): https://stackoverflow.com/a/5817436/1118357
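One way to get a function that "returns a value upon a timeout event" is a Unix alarm signal. The following is only a minimal sketch under stated assumptions, not the code from the linked answer: `call_with_timeout`, `CallTimedOut` and `fake_fetch` are hypothetical names, and it assumes Python 3 on a Unix-like system with the call made from the main thread. Whether the signal actually breaks out of the particular HTTPS hang in the question is not guaranteed.

```python
import signal
import time


class CallTimedOut(Exception):
    """Raised by the alarm handler when the time budget is used up."""


def _raise_timeout(signum, frame):
    # Runs in the main thread when SIGALRM arrives.
    raise CallTimedOut()


def call_with_timeout(func, args=(), seconds=10.0, default=None):
    """Run func(*args); return `default` if it takes longer than `seconds`."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # start the countdown
    try:
        return func(*args)
    except CallTimedOut:
        return default
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel any pending timer
        signal.signal(signal.SIGALRM, previous)   # restore the old handler


def fake_fetch(delay):
    # Hypothetical stand-in for a request that may hang.
    time.sleep(delay)
    return "page body"
```

For example, `call_with_timeout(fake_fetch, args=(5,), seconds=0.2)` gives back `None` instead of blocking for five seconds, and the caller can pick a different `default` to tell a timeout apart from an empty response.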

TCAllen07

I suggest using a separate process instead of a thread, like this:

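import time
from multiprocessing import Process

# yourFunction and some_queue are placeholders for your own worker and queue.
# args must be a tuple, hence the trailing comma.
checker = Process(target=yourFunction, args=(some_queue,))
timeout = 150  # seconds
checker.start()
counter = 0
while checker.is_alive():
    time.sleep(1)
    counter += 1
    if counter > timeout:
        print("Child process consumed too much run time. Going to kill it!")
        checker.terminate()  # stop the child process
        break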

This way, whatever happens, the child process is killed after about 150 seconds.
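The same idea can be wrapped in a helper that also hands the child's result back to the caller. This is a rough sketch rather than a drop-in solution: `run_in_process` and `_worker` are hypothetical names, and it assumes Python 3 on a Unix-like system where the "fork" start method is available (on Python 2 the `queue` module and `get_context` are not available under these names).

```python
import multiprocessing
import queue
import time


def _worker(func, args, result_queue):
    # Runs in the child process and sends the return value to the parent.
    # If func raises, nothing is sent and the parent simply times out.
    result_queue.put(func(*args))


def run_in_process(func, args=(), timeout=150.0, default=None):
    """Run func(*args) in a child process; return `default` on timeout."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like OS
    result_queue = ctx.Queue()
    child = ctx.Process(target=_worker, args=(func, args, result_queue))
    child.start()
    try:
        # Wait for a result for at most `timeout` seconds.
        return result_queue.get(timeout=timeout)
    except queue.Empty:
        return default
    finally:
        if child.is_alive():
            child.terminate()  # the child is hung or still shutting down
        child.join()           # reap the child so no zombie is left behind
```

A call such as `run_in_process(time.sleep, args=(30,), timeout=0.5, default="killed")` returns `"killed"` after roughly half a second, because the sleeping child is terminated instead of being waited for.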

WeaselFox
Also, check out this question: [link](http://stackoverflow.com/questions/8464391/what-should-i-do-if-socket-setdefaulttimeout-is-not-working/8654421#8654421) – WeaselFox Jan 02 '12 at 07:49