This code runs on Linux, but on Windows it throws AttributeError: type object 'T' has no attribute 'val'. Why?

from multiprocessing import Process
import sys

class T():
    @classmethod
    def init(cls, val):
        cls.val = val

def f():
    print(T.val)

if __name__ == '__main__':
    T.init(5)

    f()

    p = Process(target=f, args=())
    p.start()
shane

1 Answer

Windows lacks a fork() system call, the call that duplicates the current process. This has many implications, including those listed on the Windows multiprocessing documentation page. More specifically:

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start was called.

Internally, Python creates a new process on Windows by starting a fresh interpreter from scratch and telling it to load the modules again. So any change you have made to global state in the current process will not be seen by the child.
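
To watch the reload happen, here is a minimal sketch (not part of the original answer) that asks for the spawn start method explicitly, so it should behave the same way on Linux as on Windows. The helper names import_banner and work are made up for illustration.

```python
import multiprocessing as mp
import os


def import_banner():
    # Hypothetical helper: reports the name this module was loaded under.
    return f"loaded as {__name__!r} in pid {os.getpid()}"


# Module-level code: runs in the parent, and again in each spawned child
# when the child re-imports this file.
print(import_banner())


def work():
    print("child is running work()")


if __name__ == '__main__':
    # Request "spawn" explicitly instead of relying on the platform default.
    ctx = mp.get_context('spawn')
    p = ctx.Process(target=work)
    p.start()
    p.join()
```

Run as a script, the banner should appear twice: once from the parent, loaded as '__main__', and once from the child, which loads the same file under the name '__mp_main__'.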

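In your example, this means that in the child process your module will be loaded again, but the if __name__ == '__main__' section will not be run, because the child imports the module under a different name (__mp_main__) instead of running it as the main script. So T.init will never be called there, T.val won't exist, and you get the error you see.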
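
One portable way around this, sketched here as a suggestion and not the only option, is to stop relying on inherited globals and pass the value to the child explicitly. The change to f (taking val as a parameter and returning it) is my own adaptation of the question's code.

```python
from multiprocessing import Process


class T:
    @classmethod
    def init(cls, val):
        cls.val = val


def f(val):
    # Rebuild the class-level state from the argument, so it no longer
    # matters whether the child was forked or started from scratch.
    T.init(val)
    print(T.val)
    return T.val


if __name__ == '__main__':
    T.init(5)

    f(T.val)  # runs in the parent

    # The value travels to the child through args, not through globals.
    p = Process(target=f, args=(T.val,))
    p.start()
    p.join()
```

Arguments in args are pickled and sent to the child, so this works for any picklable value. Calling T.init(5) at module level, outside the __main__ guard, would also work, since that code runs again when the child loads the module.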

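On the other hand, on POSIX systems such as Linux, process creation has traditionally used fork, which leaves all global state untouched (recent Python versions default to a different start method on some POSIX platforms, for example spawn on macOS since 3.8). With fork, the child runs with a copy of everything, so it does not have to reload anything and will see its own copy of T with its own copy of val.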
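
To compare the two behaviours on one machine, the following sketch runs the same check under every start method the platform offers. It is an illustration under my own naming (report and run_with are hypothetical helpers), not code from the multiprocessing documentation.

```python
import multiprocessing as mp


class T:
    pass


def report(method):
    # Runs in the child: check whether the parent's run-time change is visible.
    print(method, 'child sees T.val:', hasattr(T, 'val'))


def run_with(method):
    # get_context() returns an object with the same API as the
    # multiprocessing module, bound to one specific start method.
    ctx = mp.get_context(method)
    p = ctx.Process(target=report, args=(method,))
    p.start()
    p.join()
    return p.exitcode


if __name__ == '__main__':
    T.val = 5  # set at run time, only in the parent
    # On Windows this list contains only 'spawn'.
    for method in mp.get_all_start_methods():
        run_with(method)
```

On Linux I would expect fork to report True, and spawn to report False, which is the situation the question runs into on Windows.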

This also means that Process creation is much faster and much lighter on resources on POSIX systems, especially as the “duplication” uses copy-on-write to avoid the overhead of actually copying the data.

There are other quirks when using multiprocessing, all of which are detailed in the Python multiprocessing programming guidelines.

spectras
  • `fork` vs `spawn` is a very old debate. The NT kernel has always had the ability to do a copy-on-write fork, but the Windows API itself, coming out of MS-DOS, doesn't allow it. Also, there are non-trivial problems with forking a multithreaded process, in which case Python 3.4 gives you two more start options in addition to "fork": "forkserver" and "spawn". The latter is the only option on Windows. – Eryk Sun Feb 10 '17 at 08:45
  • It sure is. I've always found it odd that no fork system call is available in Win32 despite the underlying functionality being present in the NT kernel. I cannot help but see it as Microsoft being reluctant to give in to the old *“Those who don't understand Unix are condemned to reinvent it”* saying. Heck, they waited until W2k8 before finally adding symlinks. I'm ready to bet there will be a `fork()`-equivalent in some future Windows version, just give them time. Probably with a name such as `DuplicateCurrentProcess`. – spectras Feb 10 '17 at 09:34
  • What I mean by not allowing it is the way the system evolved from Windows 2.0 running on DOS just won't work with `fork` -- at least not for an active process. Kernel-mode Windows (win32k.sys) extends the NT `EPROCESS` and `ETHREAD` structures in ways for which `fork` would probably be problematic, such as deadlocking. OTOH, for analyzing a process, we do have the ability to fork an inert [snapshot](https://msdn.microsoft.com/en-us/library/dn457825). – Eryk Sun Feb 10 '17 at 09:48
  • @spectras: There is no `fork()` equivalent in Windows, because there are unsolved issues around it. It may work perfectly fine in case of a single-threaded process (which was the only kind of process at the time it was invented), but this is not the case when multithreading is factored in. In essence, it increases the number of support calls without providing any added value. Whatever you are using `fork()` for, there is a better way to solve the same problem on Windows (e.g. services vs. daemons). – IInspectable Feb 10 '17 at 10:03
  • @IInspectable> well, the NT kernel supports it and it is available when using other subsystems: [“the Windows kernel has supported “fork” for a long time (going back to earlier POSIX and SFU application support), but it is not exposed in the Win32 programming model”](https://blogs.msdn.microsoft.com/wsl/2016/05/23/pico-process-overview/). Threads could be left aside, on the rationale that a process must choose *one* of fork and threads (basically what happens with POSIX threads). As for why, eryksun's explanation is plausible, though I'd be curious to know the specifics. – spectras Feb 10 '17 at 10:21
  • @spectras: How do you fork a thread that owns a global synchronization object, for example? This cannot work without violating either of two invariants (mutual exclusive ownership, or identical state). At any rate, the only real use-case for `fork()` on Windows is to enable running foreign code (subsystems, including POSIX, or scripting environments). A Windows application does not need `fork()` for anything. – IInspectable Feb 10 '17 at 10:52
  • Windows applications do not use fork because it's not available. With that reasoning, we should never add features: after all current code does not need it. *//* As for the question, there are many ways to handle this. For instance, make the fork fail if current thread owns a mutex. Or state that the child process will not own it. Current thread should know what it's doing, and mutexes owned by other threads are irrelevant because fork() only duplicates current thread in the child process. Child's state is *not* identical, just [very similar](https://linux.die.net/man/2/fork). – spectras Feb 10 '17 at 11:39
  • @spectras: Correct, state is very similar. However, mutexes are part of the things that need to be transferred over as an exact copy. Anyway, I did not say, that Windows applications do not use `fork()`. I explicitly stated, that Windows applications **do not need** `fork()`. There is just no problem that `fork()` solves, for which there isn't at least an equally suitable solution in Windows. If you disagree, name those use-cases, that absolutely require a `fork()` implementation, or would greatly benefit from a `fork()` implementation. – IInspectable Feb 10 '17 at 17:56
  • @IInspectable> there is one in this very question. That can be expanded to any interpreter-based language needing to reload and reparse all files, then execute all setup and initialization code again. Meanwhile, a `fork` would have it usable right away. / One could also argue about the inherent lack of privilege separation in threaded servers, but here is not the right place. – spectras Feb 12 '17 at 14:34