I am running into a strange deadlock in my code on Python 3.8. The implementation starts a separate Process to perform some operations on a PDF/XPS file and return the results. Occasionally the call never returns, and I cannot work out why. I cannot show the entire implementation, but it is structured like this:
def parent_function():
    ... (other code)
    results_queue = multiprocessing.Queue()
    child_process = multiprocessing.Process(target=_process_pdf_pages, args=(original_file, arg1, arg2, arg3,..., results_queue))
    child_process.start()
    logger.info('BEGIN results = results_queue.get()')
    results = results_queue.get()
    logger.info('END results = results_queue.get()')
    ... (other code)

def _process_pdf_pages(original_file, arg1, arg2, arg3,..., results_queue):
    try:
        logger.info('{} Started reading PDF/XPS file {}'.format(dt.datetime.now(), original_file))
        ... (other code)
        logger.info('{} Finished reading PDF/XPS file {}'.format(dt.datetime.now(), original_file))
        ... (other code)
        logger.info('Child process returning result')
        results_queue.put((arg1, arg2, arg3...))
    except Exception as e:
        logger.error('Child process encountered error: {}'.format(e))
        logger.error(traceback.format_exc())
        logger.info('Child process returning result')
        results_queue.put((arg1, arg2, arg3...))
Whenever this code deadlocks, the logs contain the following lines and nothing after them:
2023-06-20 14:12:55 BEGIN results = results_queue.get()
2023-06-20 14:12:55 2023-06-20 14:12:55.075017 Started reading PDF/XPS file my_file.pdf
2023-06-20 14:12:55 2023-06-20 14:12:55.745496 Finished reading PDF/XPS file my_file.pdf
Strangely, I do not see the 'Child process returning result' message that the child process logs just before it calls .put on the queue. It still looks like a deadlock rather than a long-running job: I observe no CPU usage that would indicate the child process is still busy, and I know from experience that these files take only a few seconds to process.
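To narrow this down, one option would be to replace the blocking get() with a timed one and check whether the child is still alive, so that a hung child can be told apart from one that exited without putting anything on the queue. The following is only a minimal sketch, not my actual code: get_result_or_diagnose and _worker are hypothetical names, _worker is a trivial stand-in for _process_pdf_pages, and the timeout values are arbitrary assumptions.

```python
import multiprocessing
import queue
import time


def _worker(results_queue):
    # Hypothetical stand-in for _process_pdf_pages: it only returns a value.
    results_queue.put(("ok",))


def get_result_or_diagnose(child_process, results_queue, timeout=30.0, poll=0.5):
    """Wait for a result, but stop if the child has died or the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Queue.get raises queue.Empty when nothing arrives within `poll`.
            return results_queue.get(timeout=poll)
        except queue.Empty:
            if not child_process.is_alive():
                # The child may have put its result just before exiting,
                # so try once more before declaring it lost.
                try:
                    return results_queue.get(timeout=poll)
                except queue.Empty:
                    raise RuntimeError(
                        "child exited with code {} without returning a result".format(
                            child_process.exitcode
                        )
                    ) from None
    # Reaching this point means the child is alive but silent: a genuine hang.
    raise TimeoutError("no result after {} seconds; child is still alive".format(timeout))


if __name__ == "__main__":
    results_queue = multiprocessing.Queue()
    child_process = multiprocessing.Process(target=_worker, args=(results_queue,))
    child_process.start()
    print(get_result_or_diagnose(child_process, results_queue))
    child_process.join()
```

A RuntimeError here would mean the child died before reaching .put, while a TimeoutError would mean it is alive but stuck somewhere before that call.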
Is there anything I'm doing wrong with the order of operations that is causing this problem?