First, I want to note that there are many posts on this topic, e.g., 1, 2, 3. Here are a few that I've looked to for a solution: 4, 5.
Here is the problem: I'm trying to improve the performance of some Python code using multiprocessing. The straightforward approach of running the primary computation with pool.apply_async improved performance by ~50%, but that is inadequate. Profiling indicated that much of the remaining time is overhead from forking the processes.
I thought an implementation that creates a job queue, a result queue, and a fixed number of processes (one per CPU) would outperform this. However, it performs slightly worse than the apply_async solution; in this case the overhead is in putting to and getting from the queues. The data is moderate in size.
OK, I think: my data is static, i.e., I don't write to it after the fork, so I'll make the data that I'm pushing through a pipe/queue global! Searching, I found support for this idea in other posts, here and elsewhere (see above).
So I wrote a simple example (see mp.py, mp_globals.py), and it works great! Here is the output for a small case. It uses a multiprocessing.Pool with the number of CPUs - 1 processes.
Pseudo-code:
serve: there are N of these; they spin on the job queue until they get a work-item dictionary, which they pass to a worker.
worker: takes the work-item, does the work, and pushes the result to the result queue.
There is bookkeeping to keep track of how much work has been done, etc.
The problem: when I apply this code to the actual problem, it fails because some of the data in the forked context is no longer valid. One item, the args_dict, is valid, but a list, another dict, and a class object are either empty or invalid. The example code works with all of these data types. I added prints that show id(object) for these four items, and only the args_dict has the same id value.
To be clear, the global objects are already invalid before any work is done on them. My PManager class works fine both when data is pushed through queues and when objects are global. Before running the workers, the code assigns and checks the globals, and they are correct. On entry to the worker code, 3 of the 4 values are bad, but the fourth is correct. The test code is only in two files; the project code is in many, but the interaction is between three files -- this does not seem like it should be a problem.
I tried using gc.freeze() and got the same result.
All suggestions are welcome.
P.S. I did not try a multiprocessing.Manager solution; based on comments I read and my understanding of how it works, I think it is inappropriate for this problem.