I noticed that, inside a child process, I can access functions and modules that are defined outside the child's target function. So I'm wondering: when I create a child process in Python, does it copy everything from the current process? Why can I access functions and imported modules that are outside the child target?

```python
from multiprocessing import Process, Pipe

def test1():
    return "hello"

def simpleChildProcess(childPipe):
    # simpleChildProcess can access the test1 function
    foo = test1()
    childPipe.send(foo)

parentPipe, childPipe = Pipe()
childProcess = Process(target=simpleChildProcess, args=(childPipe,))

childProcess.start()

# Written so that it runs on both Python 2 and Python 3
print("Pipe Contains: %s" % parentPipe.recv())
```
Delgan
ThunderJack

1 Answer


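On Unix-like OSes, `multiprocessing.Process` uses `os.fork` to start new processes (always in Python 2; in Python 3 this is the `fork` start method, which macOS stopped using by default in Python 3.8). `fork` creates a new process that is a copy of the parent process, and the forked process resumes from the point where `fork` was called.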
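To make the "copy of the parent" behaviour concrete, here is a minimal sketch that calls `os.fork` directly (Unix only). The names `state` and `fork_demo` are made up for this illustration, and `multiprocessing` does more bookkeeping than this, so treat it as a picture of the idea and not of the library's internals.

```python
import os

# Module-level data created before the fork (hypothetical example state).
state = {"value": "set by parent"}


def fork_demo():
    """Fork once; return (what the child reported, the parent's own value)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: execution resumes here with a copy of the parent's memory,
        # so `state` is visible even though nothing was passed explicitly.
        os.close(read_fd)
        seen = state["value"]
        state["value"] = "changed by child"  # affects only the child's copy
        os.write(write_fd, seen.encode())
        os.close(write_fd)
        os._exit(0)  # exit immediately, skipping the parent's cleanup code
    # Parent: read what the child saw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = reader.read()
    os.waitpid(pid, 0)
    return reported, state["value"]


if __name__ == "__main__":
    print(fork_demo())
```

The child reports the value the parent had set, while the child's later change never reaches the parent: the two processes start out identical and then go their own ways.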

Since Windows lacks `fork`, `multiprocessing.Process` starts a new Python process and imports the calling module. On Windows, calls to `Process` have to be inside `if __name__ == '__main__':` to prevent `Process` from being called again every time the calling module is imported. (It's good practice even on Unix to include `if __name__ == '__main__':` so that your code cannot cause runaway process-spawning.)

Thus, the child process has access to the functions and modules that the calling module had defined up to the point where the process was started (in the case of Unix) or once the calling module has been imported by the child (in the case of Windows).
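As a rough sketch of how the question's code could look with the guard in place (Python 3 syntax): the helper `run_child` and its `start_method` parameter are my own additions for illustration, and passing `None` simply uses the platform's default start method.

```python
import multiprocessing as mp


def test1():
    return "hello"


def simple_child_process(child_pipe):
    # test1 is defined at module level, so the child can see it:
    # with fork it is inherited from the parent's memory; otherwise it is
    # defined again when the child imports this module.
    child_pipe.send(test1())
    child_pipe.close()


def run_child(start_method=None):
    """Start one child and return what it sent (hypothetical helper).

    start_method=None means "use the platform default".
    """
    ctx = mp.get_context(start_method)
    parent_pipe, child_pipe = ctx.Pipe()
    child = ctx.Process(target=simple_child_process, args=(child_pipe,))
    child.start()
    # Close the parent's copy of the child's end, so that recv() raises
    # EOFError instead of blocking forever if the child dies early.
    child_pipe.close()
    result = parent_pipe.recv()
    child.join()
    return result


if __name__ == "__main__":
    # Without this guard, a child that imports this module would try to
    # start a child of its own.
    print("Pipe contains: %s" % run_child())
```

Keeping everything the child needs at module level, and everything that starts processes under the guard, is what lets the same file behave the same way whether the child is forked or started fresh.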

unutbu
  • thank you. I thought this might be the case. wouldn't it be more resource efficient to use a python subprocess to create a child process if I only wanted to run a function? – ThunderJack Aug 19 '19 at 14:48
  • well explained. I encountered many weird scenarios on Windows and this puts it well ordered in my head – Tomerikoo Aug 19 '19 at 14:53
  • @Jack: [Linux uses copy-on-write](https://stackoverflow.com/a/13128386/190597) (though I'm not sure about all Unix-likes). Thus, although forking appears to copy all data from parent to child, a physical copy is not made until the data is modified. So for OSes with copy-on-write, forking is not excessively resource intensive. If the parent process uses a lot of resources and the OS does not use copy-on-write, then using `subprocess.Popen` could be less resource intensive. – unutbu Aug 19 '19 at 15:10
  • Note that you can instruct `multiprocessing` to spawn new processes (instead of forking) by explicitly [setting a start method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods) – Sam Mason Aug 19 '19 at 15:31
  • @unutbu thank you. I had forgotten all about copy on write, and I do believe most Unix OSs do this. – ThunderJack Aug 19 '19 at 15:47
  • Thank you @SamMason, but I don't see this option list in the docs for python2, although it might be available. – ThunderJack Aug 19 '19 at 15:50
  • @Jack note that Python 2 is officially [retiring in a few months](https://pythonclock.org/) and hence you should be trying to move away from it – Sam Mason Aug 19 '19 at 15:56
  • Apparently OSX, while being Unix-like (depending on definition), does not seem to fork. This means module variables are not copied to subprocesses as they are on Linux. – Andrew Mellinger Feb 01 '21 at 14:58