
I'm working on a project where one process parses data into a Python dictionary, which is then read by another Python process that creates different views of the data. The parsed data comes from sensors handled by the worker1 process.

I have got as far as creating two processes with a shared dictionary, and I can add values to it without a problem. However, when I try to modify an existing value I hit a brick wall. I have read many answers for hours and tried solutions like creating a new object that contains the key, e.g. x = d["key"][1];x = 'newvalue'. It just doesn't work. I run the code in Python 3.6, but the issue seems similar in 2.7. Here's a simplified version of the code:

#!/usr/bin/python3
import subprocess,time
from multiprocessing import Process, Manager

def worker1(d):
    d["key"] = []    #Creating empty dict item
    d["key"]+=["hellostack"] #Adding first value
    d["key"]+=[1]    #Adding second value
    print ("from worker1:",d)
    d["key"][1]+=2 #<<<< ISSUE HERE - Modifying second value does nothing
    print ("Now value should be upped!..",d)

def worker2(d):
    while True:
        time.sleep(1)
        print ("from worker 2:",d)

manager = Manager()
d = manager.dict()

worker1 = Process(target=worker1, args=(d,))
worker2 = Process(target=worker2, args=(d,))
worker1.start()
worker2.start()
worker1.join()
worker2.join()

The output I get is:

from worker1: {'key': ['hellostack', 1]}
Now value should be upped!.. {'key': ['hellostack', 1]}
from worker 2: {'key': ['hellostack', 1]}

Anyone? :o)

EDIT: The possible duplicate doesn't focus on two separate processes, nor does it talk about dictionaries with lists inside. Admittedly it is very similar, though, and its answer actually led me to a solution, so I will leave this open for now.

Raker
  • Possible duplicate of [How does multiprocessing.Manager() work in python?](https://stackoverflow.com/questions/9436757/how-does-multiprocessing-manager-work-in-python) – Bharel Aug 24 '17 at 20:22
  • To be honest, I can see why this would look like a duplicate, especially since schlenk's very nice explanation talks about the same attributes. However, I have seen that answer before, and maybe it's just because I'm too much of a newbie, but I can't apply it to my problem. That's why I asked the question. – Raker Aug 24 '17 at 21:09

2 Answers


This is a side effect of mutable values and the way multiprocessing syncs your data between processes.

The dict you get from multiprocessing.Manager is not some kind of shared memory. The real dict lives in the manager's own server process, and each of your processes only holds a proxy that talks to it via messages. A modification only reaches the manager when the proxy notices it!

How does it notice modifications?

Easy: the proxy wraps the methods of the dict, including all the modifying ones, and forwards each call to the manager process. You can see the list of proxied methods here in the python source code.
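To make the proxy idea a bit more tangible, here is a minimal sketch that only inspects what a managed dict hands back. The function name inspect_proxy is my own and purely illustrative; the point is that d itself is a proxy object, while a value read out of it is an ordinary list, and every read produces a fresh copy.

```python
from multiprocessing import Manager


def inspect_proxy():
    """Return type names and identity facts for a managed dict holding a list."""
    with Manager() as manager:
        d = manager.dict()
        d["key"] = ["hellostack", 1]
        first = d["key"]   # __getitem__ is forwarded; the list comes back as a copy
        second = d["key"]  # a second read yields another, independent copy
        return {
            "dict_type": type(d).__name__,
            "value_type": type(first).__name__,
            "same_object": first is second,
            "equal": first == second,
        }


if __name__ == "__main__":
    print(inspect_proxy())
```

The two reads compare equal but are not the same object, so changing one of them in place cannot affect what the manager holds.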

Let's go through your example worker line by line:

d["key"] = []    #Creating empty dict item

OK, __setitem__ is called, so the proxy notices the change.

d["key"]+=["hellostack"] #Adding first value

Nearly the same here: first __getitem__ (no sync), then __setitem__ to set the new value.

d["key"]+=[1]    #Adding second value

Identical to the previous one: you first get the item, then set it again.

d["key"][1]+=2 #<<<< ISSUE HERE - Modifying second value does nothing

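OK, __getitem__ gets the list (not a ListProxy!), and the list is then modified in place, since lists are mutable values. What you modified is a plain local copy, and the DictProxy never sees a __setitem__ call, so it never syncs anything. That is why worker2 never gets to see this change. It is also why worker1's own print doesn't show it: printing d asks the manager for its data again, and the manager still has the old list.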
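If you want to check the call sequence from the walkthrough yourself, a small tracing stand-in does the job. TracingDict below is a hypothetical helper, not the real DictProxy: it is a plain dict subclass that records which item-access methods each statement triggers. Unlike the real proxy it hands back the stored list itself rather than a copy, so it only illustrates which calls happen, not the lost update.

```python
class TracingDict(dict):
    """Hypothetical stand-in for DictProxy: records item-access calls."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def __getitem__(self, key):
        self.calls.append("__getitem__")
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append("__setitem__")
        super().__setitem__(key, value)


d = TracingDict()
d["key"] = []                # __setitem__
d["key"] += ["hellostack"]   # __getitem__, then __setitem__
d["key"] += [1]              # __getitem__, then __setitem__

before = len(d.calls)
d["key"][1] += 2             # only __getitem__; the write goes to the list
last_statement_calls = d.calls[before:]

print(d.calls[:before])
print(last_statement_calls)
```

The last statement never reaches __setitem__ on the dict, which is exactly the call a real proxy would need in order to forward the change.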

schlenk
  • Thank you for the explanation. That was very helpful, especially that the dict isn't shared memory but a sync method. I looked at the class / namespaces link you provided, and to be honest I don't know how to apply it. I tried some stuff that would probably make you laugh. For example, I saw it had a __len__, so I thought maybe a len(d) would force an update, but I realise that's totally wrong. Anyway, wouldn't the value at least be '2' in the worker1 process (which it isn't either) if the problem is a sync issue? I do a print from inside worker1 and worker2 in the example. Neither of them reflects the change. – Raker Aug 24 '17 at 21:03

This is what solved my problem. The line that didn't work is commented out and the new code is explained in comments. Thank you to schlenk for the nice explanation <3 and thank you to Bharel for linking the possible duplicate <3:

#!/usr/bin/python3
import subprocess,time
from multiprocessing import Process, Manager

def worker1(d):
    d["key"]=[]    #Creating empty dict item
    d["key"]+=["hellostack"] #Adding first value
    d["key"]+=[1]    #Adding second value
    print ("from worker1:",d)
    #d["key"][1]+=2 #<<<< ISSUE HERE - Modifying second value does nothing
    d2 = d["key"] #Fetches a local COPY of the list stored under "key"
    d2[1]+=2 #Modifies value inside COPY
    d["key"] = d2 #Writes the COPY back; this __setitem__ call is what the proxy sees
    print ("Now value should be upped!..",d)

def worker2(d):
    while True:
        time.sleep(1)
        print ("from worker 2:",d)

manager = Manager()
d = manager.dict()

worker1 = Process(target=worker1, args=(d,))
worker2 = Process(target=worker2, args=(d,))
worker1.start()
worker2.start()
worker1.join()
worker2.join()
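As a possible alternative to copying and reassigning, here is a minimal sketch that stores a manager.list() inside the managed dict, so that the inner list is a proxy too. This assumes Python 3.6 or newer, where shared objects can be nested. The names bump and run_nested_demo are made up for the illustration and are not part of the code above.

```python
from multiprocessing import Manager, Process


def bump(d):
    """Runs in a child process: adds 2 to the second list element."""
    # d["key"] is a ListProxy here, so the item assignment is forwarded
    d["key"][1] += 2


def run_nested_demo():
    with Manager() as manager:
        d = manager.dict()
        # Store a list proxy instead of a plain list (assumes Python 3.6+)
        d["key"] = manager.list(["hellostack", 1])
        p = Process(target=bump, args=(d,))
        p.start()
        p.join()
        # Copy the contents out before the manager shuts down
        return list(d["key"])


if __name__ == "__main__":
    print(run_nested_demo())
```

Note that d["key"][1] += 2 is still a separate read followed by a write, so if several processes update the same element at once you would probably want a lock around it.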
Raker