
I was reading the descriptions of the two from the Python docs:

spawn

The parent process starts a fresh python interpreter process. The child process will only inherit those resources necessary to run the process objects run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver. [Available on Unix and Windows. The default on Windows and macOS.]

fork

The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic. [Available on Unix only. The default on Unix.]

And my question is:

  1. Is it that fork is much quicker because it does not try to identify which resources to copy?
  2. Is it that, since fork duplicates everything, it would "waste" much more resources compared to spawn?
Crystina

2 Answers


There's a tradeoff between the three multiprocessing start methods:

  1. fork is faster because it makes a copy-on-write clone of the parent process's entire virtual memory, including the initialized Python interpreter, loaded modules, and constructed objects in memory.

    But fork does not copy the parent process's threads. Thus locks (in memory) that other threads in the parent were holding are stuck in the child with no owning thread to unlock them, ready to cause a deadlock when code tries to acquire any of them. Also, any native library that had running threads will be in a broken state in the child.

    The copied Python modules and objects might be useful or they might needlessly bloat every forked child process.

    The child process also "inherits" OS resources like open file descriptors and open network ports. Those can also lead to problems but Python works around some of them.

    So fork is fast, unsafe, and maybe bloated.

    However these safety problems might not cause trouble depending on what the child process does.

  2. spawn starts a Python child process from scratch without the parent process's memory, file descriptors, threads, etc. Technically, spawn forks a duplicate of the current process, then the child immediately calls exec to replace itself with a fresh Python, then asks Python to load the target module and run the target callable.

    So spawn is safe, compact, and slower since Python has to load, initialize itself, read files, load and initialize modules, etc.

    However it might not be noticeably slower compared to the work that the child process does.

  3. forkserver forks a duplicate of the current Python process that trims down to approximately a fresh Python process. This becomes the "fork server" process. Then each time you start a child process, it asks the fork server to fork a child and run its target callable.

    Those child processes all start out compact and without stuck locks.

    forkserver is more complicated and not well documented. Bojan Nikolic's blog post explains more about forkserver and its obscure set_forkserver_preload() method for preloading some modules. Be wary of using an undocumented method, especially before the bug fix in Python 3.7.0.

    So forkserver is fast, compact, and safe, but it's more complicated and not well documented.
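Whichever tradeoff you pick, the start method can be selected explicitly instead of relying on the platform default. A minimal sketch (the worker function and helper name are illustrative, not part of the multiprocessing API):

```python
import multiprocessing as mp

def _work(q):
    # Runs in the child process, whatever the start method.
    q.put("hello from child")

def run_with(method):
    """Start one child with the given start method and return its message."""
    ctx = mp.get_context(method)  # "spawn", "fork", or "forkserver"
    q = ctx.Queue()
    p = ctx.Process(target=_work, args=(q,))
    p.start()
    msg = q.get()
    p.join()
    return msg

if __name__ == "__main__":
    # "spawn" is available everywhere; "fork" and "forkserver" are Unix-only.
    print(run_with("spawn"))
```

Using `get_context()` rather than the global `multiprocessing.set_start_method()` keeps the choice local, so library code elsewhere in the process isn't affected.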

[The docs aren't great on all this so I've combined info from multiple sources and made some inferences. Do comment on any mistakes.]

Jerry101
  • If I want to use "fork" with a multithreaded program including threading.Lock objects, would it be a good idea to create additional processes at the beginning of the main process's execution? Would this make the "fork" option safe (e.g. prevent the "stuck in child while locked" issue for locks + all other issues, assuming that processes are created before any other imports/instructions)? – michalmonday Dec 02 '21 at 21:22
  • @michalmonday the "fork" option is safer if the parent process is single-threaded when it forks child processes. So yes, fork additional (child) processes early on, before starting additional threads. I'm not aware of any other safety issues with "fork". – Jerry101 Dec 03 '21 at 01:43
  • fork() doesn't cause bloat even if the modules are not used. The memory that these modules occupy is shared with the parent process because fork() does copy-on-write, so they don't cost any extra memory if the child process doesn't use the modules. – Lie Ryan Dec 16 '21 at 04:43
  • @LieRyan indeed, if those pages don't get used they won't cost RAM space, but they'll add to the child process's address space, which might get it closer to the Out Of Memory killer. Also, adding/dropping a reference to any Python object in those pages will update its reference count, thus needing to copy its page(s). Python's cycle-detecting GC might need to scan those pages, thus swapping them into RAM and costing GC work as well. – Jerry101 Dec 16 '21 at 20:46
  • @Jerry101 If the reference count needs to be updated, then yes, the pages might need to be copied, but that just means the module is actually used. The multiprocessing spawn method, on the other hand, always makes copies of the module whether or not the module is used. Despite refcounts and GC, fork still has a lot less that needs to be copied than spawn. – Lie Ryan Dec 17 '21 at 00:09
  • Multiprocessing spawn is not like subprocess spawn. With subprocess spawn, you're spawning a different Python program, which can have a different (and hopefully smaller) list of loaded modules. But with multiprocessing spawn, the initialisation preloads all modules that are loaded in the main process, so it's always more bloated than fork. – Lie Ryan Dec 17 '21 at 00:14
  • @LieRyan _"it's always more bloated than fork"_ If you put some imports - those needed only in the parent - behind an `if __name__ == '__main__'` barrier, then the `spawn` method should be less bloated than `fork`, am I right? – Jeyekomon Nov 04 '22 at 15:18
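The point that fork copies only the calling thread can be observed directly. In this Unix-only sketch (the helper names are my own, not from the answer), the parent starts a second thread and then forks; the child's threading module sees only one thread:

```python
import multiprocessing as mp
import threading

def _count_threads(q):
    # Runs in the forked child: only the thread that called fork survives.
    q.put(threading.active_count())

def threads_in_parent_and_forked_child():
    stop = threading.Event()
    helper = threading.Thread(target=stop.wait)  # a second thread in the parent
    helper.start()
    try:
        in_parent = threading.active_count()     # at least main + helper
        ctx = mp.get_context("fork")             # Unix-only start method
        q = ctx.Queue()
        p = ctx.Process(target=_count_threads, args=(q,))
        p.start()
        in_child = q.get()                       # the child reports 1
        p.join()
    finally:
        stop.set()
        helper.join()
    return in_parent, in_child

if __name__ == "__main__":
    print(threads_in_parent_and_forked_child())
```

The helper thread here holds no locks, so the fork is harmless; the danger described above arises when a vanished thread was holding a lock at the moment of the fork.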
  1. Is it that fork is much quicker because it does not try to identify which resources to copy?

Yes, it's much quicker. The kernel can clone the whole process and copies only memory pages that are later modified, a page at a time. There's no need to pipe resources over to a new process or to boot the interpreter from scratch.
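Because the child is a clone of the parent's memory, state the parent created at runtime is visible in a forked child without any pickling or re-import. A small Unix-only sketch (the names are illustrative):

```python
import multiprocessing as mp

STATE = "module default"

def _report(q):
    q.put(STATE)

def forked_child_state():
    """Change a global at runtime, fork a child, and return what the child saw."""
    global STATE
    STATE = "changed by parent at runtime"
    ctx = mp.get_context("fork")               # Unix-only; child is a CoW clone
    q = ctx.Queue()
    p = ctx.Process(target=_report, args=(q,))
    p.start()
    seen = q.get()
    p.join()
    return seen

if __name__ == "__main__":
    # Under "spawn", the child would re-import this module from scratch and
    # report "module default" instead.
    print(forked_child_state())  # -> changed by parent at runtime
```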

  2. Is it that, since fork duplicates everything, it would "waste" much more resources compared to spawn?

On modern kernels, fork only does "copy-on-write", and it only affects memory pages that actually change. The caveat is that in CPython, "write" already encompasses merely iterating over an object, because doing so increments the object's reference count.

If you have long-running processes with lots of small objects in use, this can mean you waste more memory than with spawn. Anecdotally, I recall Facebook claiming to have reduced memory usage considerably by switching from "fork" to "spawn" for their Python processes.
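The reference-count effect is easy to observe: even just binding another name to an object writes to that object's header, which under fork dirties (and therefore copies) the page the object lives on. A tiny illustration using `sys.getrefcount`:

```python
import sys

obj = object()
before = sys.getrefcount(obj)  # includes a temporary reference from the call itself
alias = obj                    # a plain name binding increments obj's refcount,
after = sys.getrefcount(obj)   # i.e. CPython writes to the object's memory
print(after - before)          # -> 1
```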

Darkonaut
  • What is the default? Spawn or fork? – Kimi Sep 28 '20 at 07:26
  • @Kimi spawn: Windows, Python 3.8+ on macOS; fork: Unix, including macOS with Python < 3.8 – Darkonaut Sep 28 '20 at 07:34
  • For a Docker env - Python 3.8+, Unix - I have not used get_context(), so the default value is None and it returns self. Does that mean it is using spawn? – Kimi Sep 28 '20 at 08:10
  • @Darkonaut thanks! But why would "lots of small objects" cause more waste of memory? I thought since the "object unit" is small, the copy could be more specific? Or is it because the minimum copy unit is not an object but a page, and one change to a small object causes the whole page to be copied, which includes lots of other small objects? – Crystina Sep 28 '20 at 08:21
  • @Crystina The latter, right. It's also that your child process ends up getting copies of pages it doesn't actually need for its task, just because the parent process does something with completely unrelated objects. – Darkonaut Sep 28 '20 at 08:34
  • @Kimi Unfortunately I don't know how multiprocessing behaves with Docker. Consider asking a separate question with the `Docker` tag. – Darkonaut Sep 28 '20 at 08:36
  • -1 Python docs: "Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725." – lemi57ssss Jun 09 '21 at 12:23
  • @lemi57ssss See my first comment. Also, this question was neither about macOS nor stability, but about resources. – Darkonaut Jun 09 '21 at 12:41