15

This is a newbie question:

A class is an object, so I can create a class called pippo() and add functions and parameters inside it. What I don't understand is whether the functions inside pippo are executed from top to bottom when I assign x=pippo(), or whether I must call them explicitly, as in x.dosomething(), from outside pippo.

Working with Python's multiprocessing package, is it better to define a big function and create the object using the target argument in the call to Process(), or to create your own process class by inheriting from Process class?

Velimir Mlaker
user2239318
  • For the first part of your question: if you want a function to be executed when an object is instantiated, you can place a call to it in the class's `__init__` method. You could also use the [property decorator](http://docs.python.org/2/library/functions.html#property). I am not sure what you are asking in the second part. Could you clarify? –  Jun 18 '13 at 15:47
  • Most often you will invoke class methods via a reference to the object, such as `x.doSomething()`. You can also use the methods internally as soon as the object is instantiated by having them called from the class's `__init__` method. If you want an object's methods to "run as a process", there are several ways to do it. My personal favorite is to subclass from `Process`. I explain one way to do this here: http://stackoverflow.com/questions/15790816/python-multiprocessing-apply-class-method-to-a-list-of-objects/16202411#16202411 – DMH Jun 18 '13 at 15:47

1 Answer

47

I often wondered why Python's doc page on multiprocessing only shows the "functional" approach (using the target parameter). Probably because terse, succinct code snippets are best for illustration purposes. For small tasks that fit in a single function, I can see how that is the preferred way, à la:

from multiprocessing import Process

def f():
    print('hello')

if __name__ == '__main__':  # guard required on Windows, where children are spawned by re-import
    p = Process(target=f)
    p.start()
    p.join()

But when you need greater code organization (for complex tasks), making your own class is the way to go:

from multiprocessing import Process

class P(Process):
    def __init__(self):
        super(P, self).__init__()
    def run(self):
        print('hello')

if __name__ == '__main__':  # guard required on Windows, where children are spawned by re-import
    p = P()
    p.start()
    p.join()

Bear in mind that each spawned process is initialized with a copy of the memory footprint of the master process (with the default fork start method on Unix; on Windows, the child re-imports your module instead). And note that the constructor code (i.e. stuff inside __init__()) is executed in the master process -- only the code inside run() executes in the separate process.

Therefore, if a process (master or spawned) changes its member variables, the change will not be reflected in the other processes. This, of course, is true for the built-in types like bool, string, list, etc. You can, however, import "special" data structures from the multiprocessing module which are then transparently shared between processes (see Sharing state between processes). Or you can create your own channels of IPC (inter-process communication), such as multiprocessing.Pipe and multiprocessing.Queue.

Velimir Mlaker
    Doesn't multiprocessing require `if __name__ == "__main__"` if you're running on Windows? Just another surprise to note – Wayne Werner Jan 13 '15 at 00:08
  • If the spawned process changed some of the data types in its class would those changes be visible from the master process? – Woody1193 Dec 06 '16 at 03:54
  • @Woody1193, not for built-in data types. But if you use special shared datatypes from the `multiprocessing` module, you will get the desired effect. (I added this explanation to my answer above.) – Velimir Mlaker Dec 06 '16 at 17:42