
I'm trying to use Python's multiprocessing module to run an analysis on multiple samples in parallel. I'm using pool.map_async to run a function (called crispr_analysis) over a tuple of argument tuples (zipped_args). None of the tuples within zipped_args is empty, as an empty tuple can cause multiprocessing to hang. Once all the workers have finished, the program hangs and never moves on to the rest of the script. I know that crispr_analysis finishes because it creates its output files (written inside with statements, so they are closed properly); I can browse those files and they are complete. However, I never see the debug message for sorting the results, and the program never terminates.

try:
    #   Use map_async and get with a large timeout
    #   to allow for KeyboardInterrupts to be caught
    #   and handled with the try/except
    timeout = max((9999, 600 * len(fastq_list)))
    logging.debug("Setting timeout to %s seconds", timeout)
    res = pool.map_async(crispr_analysis, zipped_args) # type: multiprocessing.pool.MapResult
    pool.close()
    results = res.get(timeout)
except (KeyboardInterrupt, ExitPool) as error: # Handle ctrl+c or custom ExitPool
    pool.terminate()
    # pool.join()
    if isinstance(error, KeyboardInterrupt): # ctrl+c
        sys.exit('\nkilled')
    elif isinstance(error, ExitPool): # My way of handling SystemExits
        sys.exit(error.msg)
    else: # Shouldn't happen, but you know...
        raise
except:
    pool.terminate(); pool.join()
    raise
else:
    pool.join()
try:
    logging.debug("Sorting results into alignments and summaries")
    sort_start = time.time() # type: float
    alignments, summaries = zip(*results) # type: Tuple[Tuple[alignment.Alignment]], Tuple[Dict[str, Any]]
    logging.debug("Sorting results took %s seconds", round(time.time() - sort_start, 3))
except ExitPool as error: # Handle ExitPool calls for single-threaded map
    sys.exit(error.msg)

Does anyone have any idea why multiprocessing is hanging and how I can fix it?
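
One way to narrow this down might be to run the same map_async / close / get / join sequence with a trivial worker. The following is only a minimal sketch: `toy_worker` and `run_pool` are hypothetical names that are not part of the original program, and the worker merely mimics the (tuple, dictionary) return shape described here.

```python
import multiprocessing


def toy_worker(args):
    """Hypothetical stand-in for crispr_analysis: returns a (tuple, dict) pair."""
    index, name = args
    return (index, index * 2), {"name": name}


def run_pool(zipped_args, processes=2, timeout=60):
    """Run toy_worker over zipped_args using the same pool pattern as above."""
    # Python 2.7 on Linux always forks; pin the same start method on newer
    # Pythons that offer a choice (get_context does not exist in 2.7).
    if hasattr(multiprocessing, "get_context"):
        ctx = multiprocessing.get_context("fork")
    else:
        ctx = multiprocessing
    pool = ctx.Pool(processes)
    try:
        res = pool.map_async(toy_worker, zipped_args)
        pool.close()                # no more tasks will be submitted
        results = res.get(timeout)  # a timeout keeps Ctrl+C deliverable
    except BaseException:
        pool.terminate()
        pool.join()
        raise
    else:
        pool.join()
    return results


if __name__ == "__main__":
    pairs = run_pool(((0, "a"), (1, "b"), (2, "c")))
    tuples, dicts = zip(*pairs)
    print(tuples)
    print(dicts)
```

If this sketch terminates on the affected machine, the pool pattern itself is probably not at fault, and attention would shift to what crispr_analysis returns or raises.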

Extra information:

  • I'm using Python 2.7.8 on CentOS 7.3.1611; platform and Python version are not changeable
  • crispr_analysis returns a tuple and a dictionary, each of which is either empty or has some length depending on the inputs
  • I have tried omitting the pool.join() statements, to no avail
  • ExitPool is an exception that I raise to stop the entire pool in place of SystemExit; multiprocessing normally swallows SystemExits, but I want them to bubble up
  • This entire snippet is called from within a function (called main)
  • This analysis program is called from an easy-install entry script, where the entry point is the main function that starts the multiprocessing pool

    #!/usr/bin/python
    # EASY-INSTALL-ENTRY-SCRIPT: 'EdiTyper==1.0.0','console_scripts','EdiTyper'
    __requires__ = 'EdiTyper==1.0.0'
    import sys 
    from pkg_resources import load_entry_point
    
    if __name__ == '__main__':
        sys.exit(
            load_entry_point('EdiTyper==1.0.0', 'console_scripts', 'EdiTyper')()
        )
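
Because the worker's return values (the tuple and dictionary) and any ExitPool raised in a worker have to be pickled on their way back to the parent process, it may be worth ruling out a pickling problem. This is only a diagnostic sketch: `check_picklable` is a hypothetical helper, and it is an assumption, not something established in this question, that pickling has anything to do with the hang.

```python
import pickle


def check_picklable(obj):
    """Return None if obj survives a pickle round trip, else the error raised."""
    try:
        pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))
    except Exception as error:
        return error
    return None


if __name__ == "__main__":
    # Placeholder values; substitute one real crispr_analysis result
    # and an ExitPool instance when running this against the program.
    sample_result = ((), {"reads": 0})
    print(check_picklable(sample_result))
    print(check_picklable(lambda: 0))  # lambdas cannot be pickled
```

Running the helper on one real return value from crispr_analysis and on an ExitPool instance, both in a single process, would show whether either one fails to serialize or deserialize.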
    
MojaveAzure
  • Does the call to `res.get(timeout)` return within the timeout? Is the `crispr_analysis` function doing anything after it closes whatever files are storing the output? – bnaecker Nov 28 '17 at 20:43
  • @bnaecker Yes, the timeout is roughly 64 hours, while all of `crispr_analysis` takes about 6 minutes on this particular computer, running 8 samples simultaneously. After the final write, `crispr_analysis` assembles the results dictionary and tuple, then logs a message that 'analysis has been completed'. The log message comes *after* results assembling, but before the call to `return` (naturally). I see the final log message before the hang, so I know everything is done. I've tried several computers with various numbers of workers, and the hanging still occurs. – MojaveAzure Nov 28 '17 at 22:04
  • Is it possible that one of your processes dies right at the end, without propagating up the stack, as alluded to [here](https://stackoverflow.com/a/24894997/1911852)? I've run into hanging `Pool` with floating point exceptions. – yodavid May 19 '18 at 03:25

0 Answers