I am using the Python multiprocessing library to process information within a set of processes. These processes also spawn child processes that further divide the work to be done. A single Manager.Queue accumulates the results from all of the processes that consume the data.
In the main thread of the Python script, I tried to use join() to block the main thread until I can reasonably determine that all of the sub-processes have finished, and then write the output to a single file. However, the program terminates and the file is closed before all of the data has been written to it.
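For context, the processes and queues are wired up roughly like this (heavily simplified; the worker function, the queue names, and the sentinel handling here are placeholders, and the real workers spawn further sub-processes of their own):

    import multiprocessing

    def worker(in_queue, out_queue):
        # Placeholder consumer: the real workers spawn further child
        # processes, but results end up on the shared out_queue either way.
        while True:
            item = in_queue.get()
            if item is None:          # sentinel, only so this sketch terminates
                in_queue.task_done()
                break
            out_queue.put({'algorithm': item, 'result': 'ok'})
            in_queue.task_done()

    if __name__ == '__main__':
        manager = multiprocessing.Manager()
        out_queue = manager.Queue()                  # single queue collecting all results
        inQueues = [manager.Queue() for _ in range(4)]

        processes = []
        for q in inQueues:
            p = multiprocessing.Process(target=worker, args=(q, out_queue))
            p.start()
            processes.append(p)

        for i, q in enumerate(inQueues):             # hand out some dummy work
            q.put("algorithm-%d" % i)
            q.put(None)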
The following code is a simplified extraction of the implementation of the solution described above:

    for queue in inQueues:
        queue.join()

    for p in processes:
        p.join()

    print "At the end output has: " + str(out_queue.qsize()) + " records"

    with open("results.csv", "w") as out_file:
        out_file.write("Algorithm,result\n")
        while not out_queue.empty():
            res = out_queue.get()
            out_file.write(res['algorithm'] + "," + res['result'] + "\n")
            out_queue.task_done()
            time.sleep(0.05)
        out_queue.join()
        out_file.close()
out_queue.qsize() reports that more than 500 records are available, yet only 100 are written to the file. I am also not 100% certain whether 500 is the total number of records generated by the system or just the number reported at that point.
How do I ensure that all the results are written to the results.csv file?