
It seems that I can't modify a global variable in Python from a function that is called through pprocess. Here is my example:

import pprocess
import time

numbers=[0,0,0,0,0,0,0,0,0,0]

# find system time and store in global variable
def find_time(index):
    global numbers
    x=time.time()
    print "Setting element %s of numbers to %f" % (index, x)
    numbers[index]=x
    return x

# parallel execution of the function
results=pprocess.pmap(find_time, [0,1,2,3,4,5,6,7,8,9], limit=6)

for y in results:
    print '%f' % y

# this list is unchanged
print numbers

# serial execution of the function
for x in [0,1,2,3,4,5,6,7,8,9]:
    find_time(x)

# now it seems to work
print numbers

"numbers" is just a list of zeros, and for the sake of demonstration I'm trying to set each list element to the current system time. When invoked using pprocess this doesn't work, but when I use a simple for loop to call the function then the global variable is changed.

I've spent some time reading about global variables and sincerely hope this isn't a trivial issue. Can anybody explain to me what is going on?

Many thanks,

Enno

mcenno
  • Also note that there is no need for the `global` keyword there. Python will happily mutate a global object even if you haven't defined it as global. You only need `global` if you change the object your variable references via assignment. – mgilson Sep 27 '12 at 13:25

2 Answers


My understanding is that pprocess runs each call in a separate child process under the hood. If that is the case, the function modifies the child's own copy of numbers, and those changes never reach the parent process; only the return value is sent back.

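You'll probably need to make the list a managed list created through multiprocessing.Manager, so that updates go to a shared server process instead of a private copy.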

e.g.

import multiprocessing
numbers = multiprocessing.Manager().list([0]*10)
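A fuller sketch of that idea follows. It makes several assumptions: it is written for Python 3, it uses the standard multiprocessing Pool in place of pprocess, and it asks for the "fork" start method, which is only available on Unix-like systems. The helper name `run_parallel` and its parameters are invented for illustration.

```python
import multiprocessing
import time

def find_time(args):
    """Store the current time in the shared list and return it."""
    shared, index = args
    x = time.time()
    shared[index] = x  # the proxy forwards this write to the manager process
    return x

def run_parallel(count=10, workers=6):
    """Fill a managed list from worker processes and return a plain copy."""
    # Assumption: a Unix-like system where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    with ctx.Manager() as manager, ctx.Pool(workers) as pool:
        shared = manager.list([0] * count)
        pool.map(find_time, [(shared, i) for i in range(count)])
        # Copy the values out before the manager shuts down.
        return list(shared)

if __name__ == "__main__":
    print(run_parallel())
```

The list proxy is passed to the workers as an argument rather than read from a global, so every process talks to the same manager. Whether a manager proxy behaves the same way under pprocess is an assumption this sketch does not test.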
mgilson
  • Thanks for your reply. I suppose implementing this is beyond my programming capabilities, so I have now created a detour by writing the data to disk and reading them back in later (this fortunately does not create a bottleneck). – mcenno Sep 27 '12 at 14:28

pprocess runs the function in another process, which does not share memory with the calling code. Anything the parallel process modifies is modified in its own memory space, so the calling code's memory space remains unchanged. In other words, the two processes don't share global variables.

You'll have to do all communication between the two processes explicitly: through pipes, sockets, or whatever pprocess itself offers (such as the results that pmap hands back in your example).
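The separation can be shown without pprocess at all. The following is a minimal sketch using `os.fork` and `os.pipe` directly, so it assumes a Unix-like system; the function name `set_time_in_child` is invented for this example and is not how pprocess is implemented internally.

```python
import os
import time

numbers = [0.0] * 3

def set_time_in_child(index):
    """Fork a child that writes a timestamp into its own copy of `numbers`
    and sends the value back to the parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this assignment only touches the child's copy of the list.
        os.close(read_fd)
        numbers[index] = time.time()
        with os.fdopen(write_fd, "w") as writer:
            writer.write(repr(numbers[index]))
        os._exit(0)
    # Parent: read what the child sent, then wait for it to exit.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        value = float(reader.read())
    os.waitpid(pid, 0)
    return value

if __name__ == "__main__":
    received = [set_time_in_child(i) for i in range(len(numbers))]
    print(numbers)         # still all zeros in the parent
    print(received)        # timestamps delivered through the pipes
    numbers[:] = received  # the parent applies the update itself
    print(numbers)
```

The parent's list changes only in the last step, where the parent copies in the values it received. Collecting return values and assigning them in the calling code is the same pattern.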

Claudiu