I have a small crawling application written in Python 2.7 that uses threads to fetch a lot of URLs. However, it doesn't exit cleanly or respond properly to a KeyboardInterrupt, although I tried to fix the latter issue with some advice I found here.

import sys
from threading import Thread

def main():
    ...
    for i in range(NUMTHREADS):
        worker = Thread(target=get_malware, args=(malq,dumpdir,))
        worker.setDaemon(True)
        worker.start()

    ...

    malq.join()


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        sys.exit()

I need to make sure that it will exit properly when I hit Ctrl-C or when it completes its run rather than having to Ctrl-Z and kill the job.

Thanks!

Kyle Maxwell

1 Answer

There are discussions about how the GIL can affect signal handling in multithreaded, I/O-bound Python applications. Apparently, I/O-bound threads can starve the main thread of processing time, so it cannot handle signals as it is supposed to. I suggest looking at alternative parallel-processing options (such as the subprocess or multiprocessing modules) or asynchronous frameworks (such as asyncoro).
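If you would rather stay with threads, here is a minimal sketch of one common workaround; it is not the asker's actual code. In Python 2, a `join()` call without a timeout blocks the main thread in a way that signals cannot interrupt, so the sketch polls the queue with short sleeps instead. The names `fetch`, `worker`, `run`, and `poll_interval` are hypothetical, `fetch` is only a stand-in for the real download logic, and `unfinished_tasks` is an undocumented attribute of `Queue`, so treat its use as an assumption.

```python
import sys
import threading
import time

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2.7

NUMTHREADS = 4  # hypothetical worker count


def fetch(url):
    """Hypothetical stand-in for the real download logic."""
    time.sleep(0.01)
    return len(url)


def worker(q, results):
    """Pull URLs off the queue forever; daemon threads die with the process."""
    while True:
        url = q.get()
        try:
            results.append(fetch(url))
        finally:
            # Always mark the task done so the main thread can finish.
            q.task_done()


def run(urls, poll_interval=0.05):
    """Process all URLs while keeping the main thread interruptible."""
    q = queue.Queue()
    results = []
    for _ in range(NUMTHREADS):
        t = threading.Thread(target=worker, args=(q, results))
        t.daemon = True
        t.start()
    for url in urls:
        q.put(url)
    # Poll instead of calling q.join(): short sleeps let the main thread
    # receive KeyboardInterrupt. unfinished_tasks is undocumented.
    while q.unfinished_tasks:
        time.sleep(poll_interval)
    return results


if __name__ == "__main__":
    try:
        run(["http://example.com/a", "http://example.com/b"])
    except KeyboardInterrupt:
        # Daemon workers are abandoned when the interpreter exits.
        sys.exit(1)
```

Because the workers are daemon threads, exiting on Ctrl-C abandons any download in progress. If partial files matter, the workers would need their own cleanup step.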

farzad
  • I've thought about switching to multiprocessing, but I'm not sure that heavyweight processes would help much. For some reason I was under the impression that those matter more for CPU-bound applications; is that not correct? – Kyle Maxwell Feb 09 '13 at 04:05
  • This happens because there is no way to set thread priorities in Python (yet). I/O-bound threads wait for I/O events. I recall that these problems are common for applications with multiple I/O-bound threads and one (or more) CPU-bound threads. You could also use an asynchronous solution if you want to avoid heavyweight processes (http://docs.python.org/2/library/asyncore.html). – farzad Feb 09 '13 at 04:12
  • [Eventlet](http://eventlet.net/) could also let you avoid heavyweight processes and be asynchronous without changing *too* much of your code. – icktoofay Feb 09 '13 at 04:24