
Consider the following classes:

class Item(object):
    def __init__(self):
        self.c = 0
    def increase(self):
        S.increase(self)

class S(object):
    @staticmethod
    def increase(item):
        item.c += 1

This mirrors the situation I am currently in: S is some library class, and Item collects and organises data and the data-manipulation processes. Now I want to parallelise the work; for that I use the Python multiprocessing module:

from multiprocessing import Process
l = [Item() for i in range(5)]
for i in l:
    Process(target=i.increase).start()

The result is not what I expected:

[i.c for i in l]
[0, 0, 0, 0, 0]

Where am I going wrong?

Kai
  • This has nothing to do with the use of static methods, but rather with the way the multiprocessing module works. When you start a new `Process` it gets a *copy* of each object `i`. See http://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python?rq=1 for instance. To obtain a mutated object back you must either send it back from the Process, or put it in a shared area: https://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes – torek Jun 20 '16 at 12:08
  • This seems to be the issue, if you do post it as an answer, I can give you the mark! Thank you a lot! – Kai Jun 20 '16 at 12:54

1 Answer


You're expecting your mutator, the static method `increase` in class `S` (called from the non-static `increase` in class `Item`), to adjust each `i.c` field, and it does. The problem is not with the static method, but rather with the internal design of multiprocessing.

The multiprocessing package works by running multiple separate instances of Python. On Unix-like systems, it uses fork, which makes this easier; on Windows-like systems, it spawns new copies of itself. Either way, this imposes all the slightly odd restrictions described in the Python documentation: v2 and v3. (NB: the rest of the links below are to the Python2 documentation since that was the page I still had open. The restrictions are pretty much the same for both Python2 and Python3.)

In this particular case, each Process call makes a copy of the object i and sends that copy to a new process. The process modifies the copy, which has no effect on the original.
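This copy-then-modify behaviour is easy to demonstrate with a small experiment (a sketch, not from the original question; the `bump` helper is made up for illustration):

```python
from multiprocessing import Process

class Item(object):
    def __init__(self):
        self.c = 0

def bump(item):
    # Runs in the child process: this mutates the child's *copy* of item.
    item.c += 1

if __name__ == '__main__':
    i = Item()
    p = Process(target=bump, args=(i,))
    p.start()
    p.join()
    # The parent's object is untouched, whatever the child did to its copy.
    print('in parent:', i.c)  # prints "in parent: 0"
```

Whether the copy is made by `fork` (Unix) or by pickling the arguments for a fresh interpreter (Windows), the effect is the same: the child's mutations never reach the parent's object.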

To fix this, you may either send the modified objects back, e.g. through a `Queue()` or `Pipe()` instance, or place the objects into shared memory. The send-back technique is simpler and easier to program, and automatically does most of the necessary synchronization (but see the caveat in the documentation about being sure to collect all results before using a `Process` instance's `join`, even implicitly).
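A minimal sketch of the send-back technique, reusing the `Item` and `S` classes from the question (the `worker` function is an assumption added for illustration):

```python
from multiprocessing import Process, Queue

class S(object):
    @staticmethod
    def increase(item):
        item.c += 1

class Item(object):
    def __init__(self):
        self.c = 0
    def increase(self):
        S.increase(self)

def worker(item, q):
    # Runs in the child: mutate the child's copy, then ship it back.
    item.increase()
    q.put(item)

if __name__ == '__main__':
    items = [Item() for _ in range(5)]
    q = Queue()
    procs = [Process(target=worker, args=(i, q)) for i in items]
    for p in procs:
        p.start()
    # Drain the queue *before* joining: a child blocked on a full pipe
    # buffer cannot exit, so joining first can deadlock.
    items = [q.get() for _ in procs]
    for p in procs:
        p.join()
    print([i.c for i in items])  # prints "[1, 1, 1, 1, 1]"
```

Note that the objects come back in whatever order the children finish, so if ordering matters you would tag each item (e.g. with its index) before sending it.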

torek