
I have made a simple Python example of my general question. In my multiprocessing code I need to execute one function inside another function on each processor. If I use only one level of function (def f), the result is fine (I can count into a variable globally because I use a Manager for it). But with two levels of functions (def ff), the result is wrong: changes made inside ff are not applied in f afterwards.

from multiprocessing import Process, Manager
import os

def ff(b):
    b = b +1
    print('def ff b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()

def f(a):
    b = 0
    ff(b)
    a.value = a.value + b
    print('def f a = ', a.value, ' b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()

if __name__ == '__main__':
    #a = ()
    manager = Manager()
    a = manager.Value('i', 0)
    p = Process(target=f, args=(a,))
    p.start()
    p.join()
    print('Main, a = ', a.value)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

This is the result:

def ff b =  1
parent process: 12312
process id: 2320

def f a =  0  b =  0
parent process: 12312
process id: 2320

Main, a =  0
parent process: 21296
process id: 12312

My expectation:

def f reports b = 1 and a = 1
Main reports a = 1

What did I do wrong? How can I make variables inside a process global?

DmitriyN
  • What programming language are you using? Please edit your question and add it to tags, so that others can find it. – VLL Dec 14 '22 at 12:32
  • Does this answer your question? [Using global variables in a function](https://stackoverflow.com/questions/423379/using-global-variables-in-a-function) – Ahmed AEK Dec 14 '22 at 12:32
  • That post works for single-process code. But when I use multiprocessing, it doesn't work as expected. – DmitriyN Dec 14 '22 at 12:50
  • It's not "expected" for changes to globals to be copied from child processes to be copied to the parent, or copied from one child to another. They're separate copies of your program! Of _course_ they can't see each others' variables (except for those specific items, like return values, where multiprocessing is explicitly being asked to serialize specific data, copy it across process boundaries and deserialize it in the other end). What did you think "multiprocessing" meant? – Charles Duffy Dec 14 '22 at 13:44

3 Answers


You expected b to become 1 in f, and therefore a to become 1 in the parent. Your problem has nothing to do with multiprocessing or globals, you've just misunderstood the argument passing conventions of Python. The issue you're having is that you can't mutate b in f through changes in a function, ff, it's passed to; ints are immutable, and you can't pass a reference to a name to a function such that the caller's name can be rebound.
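
As a stripped-down illustration of that rebinding behavior, with no multiprocessing involved (names are illustrative):

def rebind(b):
    b = b + 1   # rebinds the local name 'b'; the caller's variable is untouched
    return b

b = 0
rebind(b)       # return value discarded
print(b)        # 0
b = rebind(b)   # assigning the return value back is the fix
print(b)        # 1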

Fixing your code is trivial; instead of trying to do C++-style pass-by-reference to achieve the change to b (which Python can't do), you need to return the new value (comments on changed/added lines):

def ff(b):
    b = b +1
    print('def ff b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()
    return b   # Return new b

def f(a):
    b = 0
    b = ff(b)  # Assign returned value back to b
    a.value = a.value + b
    print('def f a = ', a.value, ' b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()

That's it. All the rest of what you did works.
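
For reference, the corrected program should print output along these lines (the process IDs will of course differ between runs):

def ff b =  1
parent process: <main PID>
process id: <child PID>

def f a =  1  b =  1
parent process: <main PID>
process id: <child PID>

Main, a =  1
parent process: <shell PID>
process id: <main PID>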

ShadowRanger
  • Ohh, I get your point. But I found another solution, in the previous answer. Maybe not as good as yours. Thank you. – DmitriyN Dec 14 '22 at 14:34

I think I see where the issue is: I declared the global variable incorrectly. A global variable's name can't be the same as a function argument's. What I ultimately want is to start 3 async processes (on 3 different CPUs) that each execute one function (def f). That function calls a second function (def ff) 2 times. Then I get the final result of all these processes.

This is my final code, which looks like it works well:

import os
from multiprocessing import Manager, Pool

def ff(c):
    global b
    c = c + 1
    b = c
    print('def ff, p.id:', os.getpid(), ', b=', b ,',c=', c)

def f(a):
    global b
    b = 0
    for i in range(2):
        ff(b)
    a.value = a.value + b
    print('def f, p.id:', os.getpid(), ', a=', a.value, ',b=', b)

if __name__ == '__main__':
    manager = Manager()
    a = manager.Value('i', 0)
    cpu = 3
    with Pool(cpu) as pool:
        tasks = []
        for m in range(2):
            task = pool.apply_async(f, args=(a,))
            tasks.append(task)
        for task in tasks:
            task.get()
    print('Main, p.id:', os.getpid(), ', a=', a.value)

Output:

def ff, p.id: 20352 , b= 1 ,c= 1
def ff, p.id: 20352 , b= 2 ,c= 2
def ff, p.id: 29484 , b= 1 ,c= 1
def ff, p.id: 29484 , b= 2 ,c= 2
def f, p.id: 20352 , a= 2 ,b= 2
def f, p.id: 29484 , a= 4 ,b= 2
Main, p.id: 27004 , a= 4
DmitriyN
  • Using mutable global variables needlessly is usually a bad idea. It breaks thread-safety, sometimes breaks reentrancy, and makes it impossible for two different users of the API to separate their effects on the global. When it's *at all* possible, you want to pass in arguments and return results. When persistent state is needed, you want a class where the state is stored on the instances (so separate consumers in the same program can use separate instances). And they become "unglobal" when `multiprocessing` gets involved. Mutable globals are almost always the wrong solution to any problem. – ShadowRanger Dec 14 '22 at 15:23
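
To illustrate the comment above, here is a hedged sketch of the pass-arguments-in, return-results-out style it recommends (all names are illustrative; it keeps the two tasks and the loop of two calls from the answer's code):

import os
from multiprocessing import Pool

def ff(c):
    return c + 1               # pure function: takes a value, returns a value

def f(_):
    b = 0
    for _ in range(2):
        b = ff(b)              # assign the return value back
    print('def f, p.id:', os.getpid(), ', b=', b)
    return b                   # hand the result back to the parent

if __name__ == '__main__':
    with Pool(3) as pool:
        results = pool.map(f, range(2))   # two tasks, as in the code above
    # Summing in the parent also avoids the read-modify-write race that
    # 'a.value = a.value + b' on a Manager.Value can hit when several
    # workers update it concurrently.
    print('Main, p.id:', os.getpid(), ', a=', sum(results))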

My minimal reproducible example:

import os

def ff(b):
    b = b +1
    print('def ff b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()


def f(a):
    b = 0
    ff(b)
    a.value = a.value + b
    print('def f a = ', a.value, ' b = ', b)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print()


class A:
    value: int = 0

f(A())

This outputs:

def ff b =  1
parent process: 25356
process id: 8336

def f a =  0  b =  0
parent process: 25356
process id: 8336

Why is that?

Because ints are passed by copy instead of by reference. So when calling f(A()), the instance of class A is passed by reference (because an instance of a user-defined class is mutable), but when calling ff(b), b is passed by copy, so b = b + 1 works only within the function scope, not outside.
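
A small sketch of the distinction being described here (names are illustrative; see also the correction in the comments below):

def grow(lst):
    lst += [1]   # in-place mutation: the caller's list object changes

def bump(n):
    n += 1       # rebinds the local name only: the caller's int is unchanged

nums = []
grow(nums)
print(nums)      # [1]

x = 0
bump(x)
print(x)         # 0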

slouchart
    This misses the bigger (multiprocessing-specific) picture. You _can't_ have a pointer to a different process's memory, so a ref pointing to content owned by a separate process is impossible. Using a mutable type instead of int, the OP would still be unable to use globals for coordination across process boundaries. – Charles Duffy Dec 14 '22 at 13:48
  • Sure. But the OP used an instance of Value in the Manager instance. This should prevent any unrestricted access to the shared memory. – slouchart Dec 14 '22 at 14:00
  • `int` is passed by object reference, not value, just like any other argument in Python. It's similar to passing in a pointer in C; you can change the thing pointed to, but you can't change the caller's pointer (if you reassign the pointer inside the function, you lose access to the data the caller's pointer referred to). `int`s are immutable though, so you can't modify the object, and it *acts* a lot like passing by value. Mutable types can be changed by in-place mutation (`list_from_args += [1, 2, 3]` would change the caller's `list`), but immutable types just rebind a new object. – ShadowRanger Dec 14 '22 at 14:04
  • @CharlesDuffy: `multiprocessing` has nothing to do with the OP's problem. Their problem is with `b` not changing in `f` (a problem that would occur if `f` were called directly in the main process with no `multiprocessing` involved), and this answer does explain the problem (somewhat inaccurately), even if it doesn't provide a fix. – ShadowRanger Dec 14 '22 at 14:07