
I've read a lot of questions on SO and elsewhere on this topic but can't get it working. Perhaps it's because I'm using Windows, I don't know.

What I'm trying to do is download a bunch of files (whose URLs are read from a CSV file) in parallel. I've tried using multiprocessing and concurrent.futures for this with no success.

The main problem is that I can't stop the program on Ctrl-C - it just keeps running. This is especially bad in the case of processes instead of threads (I used multiprocessing for that) because I have to kill each process manually every time.

Here is my current code:

import concurrent.futures
import signal
import sys
import urllib.request

class Download(object):
  def __init__(self, url, filename):
    self.url = url
    self.filename = filename

def perform_download(download):
  print('Downloading {} to {}'.format(download.url, download.filename))
  return urllib.request.urlretrieve(download.url, filename=download.filename)  

def main(argv):
  args = parse_args(argv)
  queue = []
  with open(args.results_file, 'r', encoding='utf8') as results_file:
    # Irrelevant CSV parsing...
    queue.append(Download(url, filename))

  def handle_interrupt():
    print('CAUGHT SIGINT!!!!!!!!!!!!!!!!!!!11111111')
    sys.exit(1)

  signal.signal(signal.SIGINT, handle_interrupt)

  with concurrent.futures.ThreadPoolExecutor(max_workers=args.num_jobs) as executor:
    futures = {executor.submit(perform_download, d): d for d in queue}
    try:
      concurrent.futures.wait(futures)
    except KeyboardInterrupt:
      print('Interrupted')
      sys.exit(1)

I'm trying to catch Ctrl-C in two different ways here, but neither works. The latter one (except KeyboardInterrupt) actually gets run, but the process won't exit after calling sys.exit.

Before this I used the multiprocessing module like this:

  try:
    pool = multiprocessing.Pool(processes=args.num_jobs)
    pool.map_async(perform_download, queue).get(1000000)
  except Exception as e:
    pool.close()
    pool.terminate()
    sys.exit(0)

So what is the proper way to add the ability to terminate all worker threads or processes once you hit Ctrl-C in the terminal?

System information:

  • Python version: 3.6.1 32-bit
  • OS: Windows 10
szx
  • At least for the multiprocessing module, it is a known bug: http://bugs.python.org/issue8296 Also see https://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool – havanagrawal Jul 02 '17 at 03:45
  • This one worked for me. Try https://stackoverflow.com/a/31795242/4385319 – technusm1 Jul 02 '17 at 03:56

2 Answers


Your signal handler catches SIGINT and re-routes it as a SystemExit exception. This prevents the KeyboardInterrupt exception from ever reaching your main loop.

Moreover, if the SystemExit is not raised in the main thread, it will just kill the child thread where it is raised.
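
To illustrate that second point, here is a minimal sketch showing that sys.exit() called in a worker thread ends only that thread, while the main thread carries on. The names worker and run_demo are made up for this example and do not come from the question.

```python
import sys
import threading


def worker(log):
    """Hypothetical worker: records a message, then calls sys.exit()."""
    log.append('worker started')
    sys.exit(1)  # raises SystemExit in this thread only
    log.append('never reached')


def run_demo():
    log = []
    t = threading.Thread(target=worker, args=(log,))
    t.start()
    t.join()
    # The interpreter is still alive: SystemExit ended only the worker thread.
    log.append('main thread still running')
    return log


if __name__ == '__main__':
    print(run_demo())
```

The threading module silently discards a SystemExit raised inside a thread, so no traceback is printed and the process keeps running.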

Jesse Noller, the author of the multiprocessing library, explains how to deal with CTRL+C in an old blog post.

import signal
from multiprocessing import Pool


def initializer():
    """Ignore CTRL+C in the worker process."""
    signal.signal(SIGINT, SIG_IGN)


pool = Pool(initializer=initializer)

try:
    pool.map(perform_download, downloads)
except KeyboardInterrupt:
    pool.terminate()
    pool.join()
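
For the ThreadPoolExecutor version in the question, a similar idea might look like the sketch below. Threads cannot be terminated from outside, and leaving the `with` block calls shutdown(wait=True), which waits for every submitted future, so sys.exit alone does not end the program. This sketch assumes the download can be split into chunks so the worker can check a shared flag between them. The stop_requested event, the fake chunk loop, and run_all are hypothetical stand-ins, not the asker's code.

```python
import concurrent.futures
import threading
import time

stop_requested = threading.Event()  # hypothetical shared stop flag


def perform_download(item):
    """Stand-in for a chunked download that checks the flag between chunks."""
    for _ in range(20):
        if stop_requested.is_set():
            return None  # give up early once a stop was requested
        time.sleep(0.01)  # pretend to fetch one chunk
    return item


def run_all(items, max_workers=4):
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
    futures = [executor.submit(perform_download, i) for i in items]
    try:
        # Wait with a short timeout so the main thread regularly returns to
        # Python code, where a pending KeyboardInterrupt can be raised.
        while True:
            done, not_done = concurrent.futures.wait(futures, timeout=0.1)
            if not not_done:
                break
    except KeyboardInterrupt:
        stop_requested.set()   # ask running workers to stop
        for f in futures:
            f.cancel()         # drop tasks that have not started yet
    finally:
        executor.shutdown(wait=True)  # returns quickly once workers see the flag
    return [f.result() for f in futures if not f.cancelled()]


if __name__ == '__main__':
    print(run_all(list(range(8))))
```

Tasks that were already running when the flag was set return None here; a real downloader would probably also want to delete the partially written file.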
noxdafox

I don't believe the accepted answer works under Windows, certainly not under current versions of Python (I am running 3.8.5). In fact, it won't run at all since SIGINT and SIG_IGN will be undefined (what is needed is signal.SIGINT and signal.SIG_IGN).

This is a known problem under Windows. A solution I have come up with is essentially the reverse of the accepted solution: the main process must ignore keyboard interrupts, and the pool initializer sets a global flag, ctrl_c_entered, to False in each worker process and installs a handler that sets the flag to True if Ctrl-C is entered. Every multiprocessing worker function (or method) is then decorated with a special decorator, handle_ctrl_c, that first tests the ctrl_c_entered flag. Only if the flag is False does it run the worker function, after re-enabling keyboard interrupts and establishing a try/except handler for keyboard interrupts. If the ctrl_c_entered flag was True, or if a keyboard interrupt occurs during the execution of the worker function, the value returned is an instance of KeyboardInterrupt, which the main process can check to determine whether Ctrl-C was entered.

Thus all submitted tasks will be allowed to start, but each will immediately terminate with a return value of a KeyboardInterrupt instance; the decorator never calls the actual worker function once Ctrl-C has been entered.

import signal
from multiprocessing import Pool
from functools import wraps
import time

def handle_ctrl_c(func):
    """
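    Decorator that skips the wrapped worker once Ctrl-C has been seen and
    returns a KeyboardInterrupt instance instead of raising it.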
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        global ctrl_c_entered
        if not ctrl_c_entered:
            # re-enable keyboard interrupts:
            signal.signal(signal.SIGINT, default_sigint_handler)
            try:
                return func(*args, **kwargs)
            except KeyboardInterrupt:
                ctrl_c_entered = True
                return KeyboardInterrupt()
            finally:
                signal.signal(signal.SIGINT, pool_ctrl_c_handler)
        else:
            return KeyboardInterrupt()
    return wrapper

def pool_ctrl_c_handler(*args, **kwargs):
    global ctrl_c_entered
    ctrl_c_entered = True

def init_pool():
    # set global variable for each process in the pool:
    global ctrl_c_entered
    global default_sigint_handler
    ctrl_c_entered = False
    default_sigint_handler = signal.signal(signal.SIGINT, pool_ctrl_c_handler)

@handle_ctrl_c
def perform_download(download):
    print('begin')
    time.sleep(2)
    print('end')
    return True

if __name__ == '__main__':
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    pool = Pool(initializer=init_pool)
    results = pool.map(perform_download, range(20))
    if any(map(lambda x: isinstance(x, KeyboardInterrupt), results)):
        print('Ctrl-C was entered.')
    print(results)
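
Because interrupted tasks come back as KeyboardInterrupt instances rather than raising, the caller has to separate them from real results. One possible way to do that is sketched below; split_results is a hypothetical helper and not part of the code above.

```python
def split_results(results):
    """Separate real return values from KeyboardInterrupt markers."""
    completed = [r for r in results if not isinstance(r, KeyboardInterrupt)]
    interrupted = len(results) - len(completed)
    return completed, interrupted


if __name__ == '__main__':
    sample = [True, True, KeyboardInterrupt(), KeyboardInterrupt()]
    completed, interrupted = split_results(sample)
    print(completed, interrupted)
```

The interrupted count tells the main process how many tasks would need to be resubmitted if the run were resumed later.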
Booboo