
I am trying to implement a shared counter with multiprocessing. I am using a global variable to get around pickling problems. For reasons I do not understand, the increments do not seem to apply to my global counter list (the value is always 0). I guess the code is using a local copy of the variable that gets discarded.

I thought that global lists could be modified in place, since they are mutable. I also tried explicitly declaring `global list_global` to specify that I want to use the global definition of the variable.

Can someone please point out my error?

from multiprocessing import Pool, Value, Lock

list_global = [] # global variable to hold Counter values


#http://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing/
class Counter(object):    
    def __init__(self, initval=0):
        self.val = Value('i', initval)
        self.lock = Lock()
##        self.val = initval

    def increment(self):
        with self.lock:
            self.val.value += 1

    def value(self):
        with self.lock:
            return self.val.value


def process_item(x):
    global list_global
    list_global[0].increment()     # increments
    return list_global[0].value()  # correctly returns incremented value


def main():
    global list_global

    print 'before', list_global[0].value()

    pool = Pool()
    print pool.map(process_item, range(10))
    pool.close()
    pool.join()

    #increments in process_item are not persistent
    #(do not appear to be modifying the global variable)
    print 'after', list_global[0].value()  #=> 0


# list_global holds 3 Counter objects
for i in range(3):
    list_global.append(Counter(0))


if __name__ == '__main__':
    global list_global
    main()
    #print list_global # list_global holds "Counter" objects
    for i in list_global:
        print i.value(), #=>[0,0,0] # expected [10,0,0]
Roberto
  • You're not "working around" pickling problems by using a global variable. That's not how it's done. Your current implementation is flawed. – g.d.d.c Jun 17 '14 at 19:38
  • You *really* weren't able to find the question [Python multiprocessing and a shared counter](http://stackoverflow.com/questions/2080660/python-multiprocessing-and-a-shared-counter), not even when typing the title of your question? (it pops up as the first result...) Please, *search* before asking a question. – Bakuriu Jun 17 '14 at 20:32
  • @Bakuriu - I had seen that example. I thought my implementation was different because I used shared memory `Value` objects. This did not work out as I had hoped, but it was not so obvious to me (before asking) that the answer was a duplicate. Sorry for the confusion. – Roberto Jun 17 '14 at 20:47

1 Answer


To be more specific than my comment: your problem is that you're fundamentally misunderstanding what multiprocessing does, which gives you faulty expectations about the output. You can't declare a global variable and then share it across multiple processes. You can get away with a little more when you use threads, but to understand why you're having trouble you need to realize what multiprocessing actually does.
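To make that concrete, here is a minimal sketch (the names `flag` and `child` are mine, not from your code) showing that a global mutated in a child process is invisible to the parent:

```python
from multiprocessing import Process

flag = []  # module-level global, just like list_global

def child():
    flag.append('set in child')  # mutates the child's own copy only

if __name__ == '__main__':
    p = Process(target=child)
    p.start()
    p.join()
    print(flag)  # the parent's list is still empty: []
```

The child really does run `flag.append(...)`, but it does so on its own copy of the module's globals, so the parent never sees the change.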

When Pool() starts your child processes, each of them runs in its own Python interpreter with its own address space; on platforms that spawn rather than fork, each child even re-imports your module and your top-level function process_item. Either way, every child process ends up with its own copy of list_global. Your global statements don't magically make those separate running processes share a list defined in your module; global only affects name scoping within a single process.
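One common fix (a sketch of one approach, not the only one) is to create the shared `Value` in the parent and hand it to every worker through `Pool`'s `initializer`/`initargs`, so each child process attaches to the same shared memory. Note that a `Value` already carries its own lock, reachable via `get_lock()`, so the separate `Lock` in your `Counter` isn't needed:

```python
from multiprocessing import Pool, Value

def init_worker(shared):
    # Runs once in each child at startup; stash the shared Value in a
    # module-level global so the worker function can reach it.
    global counter
    counter = shared

def process_item(x):
    with counter.get_lock():   # Value provides its own lock
        counter.value += 1
        return counter.value

if __name__ == '__main__':
    counter = Value('i', 0)
    pool = Pool(initializer=init_worker, initargs=(counter,))
    print(pool.map(process_item, range(10)))
    pool.close()
    pool.join()
    print('after', counter.value)  # now 10, not 0
```

Here the global in each child is set deliberately by the initializer to point at one shared object, instead of each child accidentally getting an independent copy.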

g.d.d.c
  • very interesting. Thanks for taking the time to explain. Obviously, I have some more reading to do. – Roberto Jun 17 '14 at 20:10