I've been reading up on Python's "multiprocessing" module, specifically the "Pool" class. I'm familiar with threading but not with the approach used here. If I pass a very large collection (say, a dictionary of some sort) to the process pool ("pool.map(myMethod, humungousDictionary)"), is a copy of the dictionary made in memory and then handed off to each process, or does only the one dictionary exist? I'm concerned about memory usage. Thank you in advance.
- Not sure why you would pass a dictionary to the map method, but the dictionary will be treated as an iterable of its keys, and a portion of the keys will be passed to each subprocess. – newtover Jun 19 '17 at 19:06
- It was a contrived example. I just wanted to illustrate the case where multiple processes are trying to process a large data set. This is actually for a numerical application, so the example could have been given with a very large NumPy array or something like that. I'm just concerned that under the hood the data set is going to be duplicated for each process that needs to access it. – LKeene Jun 19 '17 at 19:36
1 Answer
The short answer is: no, there is not just the one dictionary. Each process works in its own independent memory space, so any data a worker receives is effectively a duplicate of your data.
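To make the independent memory spaces concrete, here is a minimal sketch. The names `big_dict`, `my_method`, and `run_demo` are made up for illustration and stand in for the question's `humungousDictionary` and `myMethod`. It shows two things: `pool.map` iterates the dictionary, so each call receives a single key, and a change a worker makes to the dictionary never reaches the parent.

```python
import multiprocessing as mp
import os

# Hypothetical stand-in for the large data set from the question.
big_dict = {"a": 1, "b": 2, "c": 3}


def my_method(key):
    # pool.map iterates the dict, so each call gets one key, not the dict.
    # This write only changes the copy that lives in the current process.
    big_dict[key] = -1
    return key, os.getpid()


def run_demo():
    with mp.Pool(processes=2) as pool:
        results = pool.map(my_method, big_dict)
    keys = [key for key, _pid in results]
    # Return the keys the workers saw and the parent's view of the dict.
    return keys, dict(big_dict)


if __name__ == "__main__":
    keys, parent_view = run_demo()
    print(keys)         # the dictionary's keys, in order
    print(parent_view)  # still the original values in the parent
```

The `if __name__ == "__main__":` guard matters on platforms where workers are started by re-importing the main module (Windows, and macOS by default).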
If your dictionary is read-only, here are some options you could consider:
- Save your data into a database. Each worker reads the data it needs and works independently.
- Have a single parent process that spawns multiple workers using os.fork. Each child process then starts out with the parent's memory contents, which the OS shares copy-on-write until a page is written to.
- Use shared memory. Unix systems offer shared memory for interprocess communication. If there is a chance of race conditions, you will need semaphores as well.
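As a rough illustration of the shared-memory option, the sketch below uses the standard library's `multiprocessing.shared_memory` module, which assumes Python 3.8 or newer (it did not exist when this answer was written). The helper names `create_block` and `read_block` and the three-double payload are invented for the example. For brevity the reader attaches from the same process; a worker process would attach the same way, using the block's name.

```python
import struct
from multiprocessing import shared_memory

VALUES = [1.5, 2.5, 4.0]   # hypothetical payload
FMT = f"{len(VALUES)}d"    # layout: three C doubles


def create_block(values):
    # The owner allocates the block and writes the data into it once.
    shm = shared_memory.SharedMemory(create=True, size=struct.calcsize(FMT))
    struct.pack_into(FMT, shm.buf, 0, *values)
    return shm


def read_block(name):
    # A reader attaches to the existing block by name; nothing is copied
    # until the values are unpacked into Python objects.
    shm = shared_memory.SharedMemory(name=name)
    try:
        return list(struct.unpack_from(FMT, shm.buf, 0))
    finally:
        shm.close()


owner = create_block(VALUES)
try:
    seen = read_block(owner.name)
finally:
    owner.close()
    owner.unlink()  # release the block once nobody needs it
```

Only the block's name has to be passed to the workers, which keeps the pickled arguments small. If workers also write to the block, access would need to be guarded, for example with a `multiprocessing.Lock`.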
You may also consider referring here for deeper insight on a possible solution.

cs95
- Thank you Coldspeed. Can shared memory be used on Windows (and MacOSX)? – LKeene Jun 19 '17 at 20:02
- @LKeene You may want to take a look at this: https://docs.python.org/2/library/mmap.html... hopefully cross platform. – cs95 Jun 19 '17 at 20:04
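For what the `mmap` suggestion might look like in practice, here is a small sketch under some assumptions: the data is written once to a file as packed C doubles, and each worker would map that file read-only instead of receiving a copy. The helpers `write_data` and `read_value` and the file layout are hypothetical.

```python
import mmap
import os
import struct
import tempfile

FMT = "4d"  # hypothetical layout: four C doubles


def write_data(path, values):
    # Done once, before any worker starts.
    with open(path, "wb") as f:
        f.write(struct.pack(FMT, *values))


def read_value(path, index):
    # Each worker would map the file read-only and unpack only what it
    # needs; the OS can back every mapping with the same cached pages.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from("d", mm, index * struct.calcsize("d"))[0]


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_data(path, [0.0, 1.0, 2.0, 3.0])
    third = read_value(path, 2)
finally:
    os.remove(path)
```

File-backed read-only mappings like this work on Windows, macOS, and Linux, so workers only need to be told the file path.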