I am trying to perform parallel processing for my requirements, and the code seems to be working as expected for 4k-5k elements in parallel. But as soon as the elements to be processed start increasing, the code processes a few listings and then without throwing any error, the program stops running abruptly.
I checked and the program is not hung, the RAM is available (I have a 16 Gb RAM) and CPU Utilization is not even 30%. Can't seem to figure out what is happening. I have 1 million elements to be processed.
def get_items_to_download():
#iterator to fetch all items that are to be downloaded
yield download_item
def start_download_process():
multiproc_pool = multiprocessing.Pool(processes=10)
for download_item in get_items_to_download():
multiproc_pool.apply_async(start_processing, args = (download_item, ), callback = results_callback)
multiproc_pool.close()
multiproc_pool.join()
def start_processing(download_item):
try:
# Code to download item from web API
# Code to perform some processing on the data
# Code to update data into database
return True
except Exception as e:
return False
def results_callback(result):
print(result)
if __name__ == "__main__":
start_download_process()
UPDATE -
Found the error- BrokenPipeError: [Errno 32] Broken pipe
Trace -
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/pool.py", line 125, in worker
put((job, i, result))
File "/usr/lib/python3.6/multiprocessing/queues.py", line 347, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe