import multiprocessing

def test(d):

    a = {}
    for i in range(10):
        a[i] = i +1
    print(a)
    d = a


if __name__ == "__main__":

    manager = multiprocessing.Manager()
    d = manager.dict()

    p = multiprocessing.Process(target = test, args = (d, ))
    p.start()
    p.join()
    print(d)

I am trying to fill a dictionary inside `test(d)` using a `multiprocessing.Manager`. Why does `a` contain the expected entries while `d` in the main process prints as an empty dict, even though I assign `d = a`?

Phil
  • Mandatory link to [Ned Batchelder](https://nedbatchelder.com/text/names.html) – quamrana Oct 21 '22 at 13:47
  • Oh that means `d` refers to the dict created in the `test` function now? So `d` is just a usual dictionary now and not a shareable manager object anymore? – Phil Oct 21 '22 at 13:57
  • Yes, but just the `d` parameter inside the `test` function. The `d` in `"__main__"` is still the empty `manager.dict()` you created there. What are you trying to achieve? – quamrana Oct 21 '22 at 13:59
  • Instead of doing `d = a`, try `d.update(a)` in `test`. – Charchit Agarwal Oct 21 '22 at 14:03
  • I am fetching data from an API request and using `a` as a temporary dict. Then I want to replace the contents of the old `manager.dict()` with `a`. I could clear `d` and put the new data from the API request directly into `d`, but then there is a short window in which `d` is incomplete (the data set is huge, so filling `d` takes a few ms). My idea now is to keep using `a` and, instead of `d = a`, call `d.clear()` followed by `d.update(a)`. That should work, right? – Phil Oct 21 '22 at 14:46
  • Yes, that sounds right. There will be an amount of time needed to perform the `d.update(a)`, but I think that the `manager` will use a write lock to ensure that no other process can see it. – quamrana Oct 21 '22 at 15:20
  • @quamrana The `manager` and its objects exist in a separate process. When a method is invoked on a proxy for such an object, the method and its arguments are sent to the process where the operation on the actual object will be performed in a thread. The issue then is whether the `update` method is thread-safe, and it is, because the GIL must be acquired before any `dict` method can be executed. But this locking is done by the Python interpreter and not by the `manager`. – Booboo Oct 21 '22 at 16:05
  • @Booboo: Well, I looked at this [answer](https://stackoverflow.com/a/47875528/4834) to a question which implied there was a lock. Do you have any insight? – quamrana Oct 21 '22 at 16:34
  • @quamrana There is one GIL per process. So if you have 5 `dict` instances within the manager's process being updated by 5 other processes, the actual updates are occurring within that manager's process in 5 threads, each trying to lock the same GIL. That is why the updates do not occur in parallel even though there would be no problem if they did since they are separate `dict` instances. This is the problem with multithreading and one reason why it is best to avoid managed objects in addition to the remote method call overhead. So the lock in question is the GIL, not any manager-specific lock. – Booboo Oct 21 '22 at 16:56
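
To make the suggestion from the comments concrete, here is a minimal sketch that contrasts rebinding the parameter (`d = a`) with mutating the managed dict in place (`d.clear()` followed by `d.update(a)`). The helper names `build_snapshot`, `rebind`, `mutate`, and `run` are made up for illustration, and `build_snapshot` only stands in for the API request mentioned above.

```python
import multiprocessing


def build_snapshot():
    # Hypothetical stand-in for the API request; returns the temporary dict.
    return {i: i + 1 for i in range(10)}


def rebind(d):
    # Rebinds the local name `d` only; the caller's object is never touched.
    d = build_snapshot()


def mutate(d):
    # Changes the object that `d` refers to, so the caller sees the result.
    a = build_snapshot()
    d.clear()    # drop keys that are absent from the new snapshot
    d.update(a)  # a single call on the proxy copies all entries over


def run(target):
    # Run `target` in a child process against a fresh managed dict and
    # return a plain-dict copy of whatever ended up in it.
    with multiprocessing.Manager() as manager:
        d = manager.dict()
        p = multiprocessing.Process(target=target, args=(d,))
        p.start()
        p.join()
        return d.copy()


if __name__ == "__main__":
    print(run(rebind))  # {}
    print(run(mutate))  # the ten entries 0: 1 through 9: 10
```

Note that `d.clear()` and `d.update(a)` are two separate calls on the proxy, so another process reading `d` between them could, in principle, still see an empty dict. Using `d.update(a)` alone avoids that gap but leaves behind any keys that are no longer in `a`.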
