I have written a bioinformatics Python program that makes heavy use of Python's multiprocessing package. I see a discrepancy in the memory used by the child processes when the program is run on MacOSX versus Linux systems: MacOSX uses much less memory.
When I profile the memory of the child processes on each system, I see a pronounced difference between the platforms. I profile each process when it begins and when it ends using the following call (based on this SO answer; note that MacOSX reports the memory usage of the process in bytes, whereas Linux reports it in kilobytes):
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
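For reference, my measurement looks roughly like this (a minimal sketch; peak_rss_mb is just an illustrative helper name, and it normalises the units since MacOSX reports bytes while Linux reports kilobytes). I call it at the start and at the end of each worker and log both values:

    import resource
    import sys

    def peak_rss_mb():
        # ru_maxrss is the peak resident set size of the calling process.
        # MacOSX reports it in bytes, Linux in kilobytes, so normalise to MB.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        if sys.platform == 'darwin':
            return peak / (1024.0 * 1024.0)  # bytes -> MB
        return peak / 1024.0                 # kilobytes -> MB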
Linux reports that each process requires about 1 GB, whereas MacOSX reports that each one takes roughly 300 MB. What's more, on MacOSX the memory usage seems to start small and grow over the course of the process, whereas on Linux it starts at around 1 GB and stays there.
So my questions:
Does this have something to do with the way each platform handles forking? Perhaps MacOSX spawns a new process whereas Linux forks by default. I am using Python 2.7, so I can't control the start method of the processes (I think; see the sketch after my questions).
Am I right in thinking that this is a forking issue? Has anyone else come across this problem? How can I control the memory usage in Linux?
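To frame that last point: my understanding is that on Python 3.4+ the start method can be chosen explicitly, which is what I would try if I could upgrade. A hypothetical sketch (not my actual code; work is just a stand-in for my real worker function):

    import multiprocessing as mp

    def work(chunk):
        # stand-in for the real per-chunk analysis
        return len(chunk)

    if __name__ == '__main__':
        # Python 3.4+ only: 'spawn' starts each child from a fresh interpreter
        # instead of fork()ing a copy of the parent's address space.
        mp.set_start_method('spawn')
        pool = mp.Pool(4)
        print(pool.map(work, [[1, 2, 3], [4, 5]]))
        pool.close()
        pool.join()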