This question is related to: multiprocessing: How do I share a dict among multiple processes?

I have multiple numpy arrays stored in a multiprocessing dict. The dict is declared and populated with numpy arrays at predefined keys, and each subprocess only writes to and modifies the data under a single key. The dictionary isn't updated by the subprocesses, even though the subprocesses seem to do something (shouldn't the dictionary be modified "in place", at the memory location of the dictionary declared in the main process?).

I do not understand why it isn't working. Is the data contained in the dict copied to each subprocess, modified there, and never returned to the main process? If that's the case, is there a way to modify the data without copying it somewhere else? In multiprocessing there may be a problem with unwanted data deletion when multiple processes try to write to the same address. In my case, since each subprocess only writes to a specific key, will this unwanted data deletion be a problem?

Sample code:

    
import datetime
import numpy as np
import random
from multiprocessing import Process,Manager

class nbrgen(object):
    def __init__(self,ticker,TBA,delay):
        self.delay=delay
        self.value=100.00
        self.volume=50
        self.ticker=ticker
        self.TBA=TBA

    def generate_value(self):
        self.value=round (self.value + random.gauss(0,1)*self.delay + 0.01 ,2)
        self.volume=random.randint(1,100)

    def __next__(self):
        return self.next()

    def next(self):
        self.generate_value()
        t=datetime.datetime.now(tz=datetime.timezone.utc)
        return np.array([t,self.ticker,self.TBA,self.value,self.volume])

def apenddict(D, tik,gnr):
    for i in range(8):
        print(tik)
        D[tik][:-1] = D[tik][1:]
        D[tik][-1:, :] = gnr.next()


if __name__ =="__main__":
    manager=Manager()
    d=manager.dict()
    d["TOK"] = np.zeros((10, 5), dtype="O")
    d["TIK"] = np.zeros((10, 5), dtype="O")

    p1=Process(target=apenddict,args=(d,"TIK",nbrgen("TIK","T",0.1)))
    p2=Process(target=apenddict,args=(d,"TOK",nbrgen("TOK","T",0.1)))

    p1.start()
    p2.start()
    p1.join()
    p2.join()

    print(d)

This prints TIK and TOK in an interleaved order (as expected), and then

{'TOK': array([[0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0]], dtype=object), 'TIK': array([[0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0]], dtype=object)}

is printed: both arrays are still all zeros.

1 Answer

This is due to the way Manager() manages the dict. Only the dict itself is managed, not its entries (unless the entries happen to be proxy objects). Reading D[tik] hands the subprocess a copy of the stored array, so when working with mutable dict entries, in-place updates to those objects are not handled (or even registered) by the manager.

However, it is possible to update the dict itself, e.g. by replacing a full dict entry. Thus, a way around the above limitation is to get a local reference to the entry, update it, and then re-assign the local reference to the managed dict.
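The behaviour described above can be reproduced in a single process with plain lists instead of numpy arrays. The following is only a minimal sketch: the function name `demo` and the sample values are made up for illustration, and the last step assumes Python 3.6 or later, where a `manager.list()` stored inside a managed dict is itself kept as a proxy.

```python
from multiprocessing import Manager


def demo():
    """Return a dict entry after an in-place edit, after re-assignment,
    and after an in-place edit of a nested proxy (hypothetical helper)."""
    with Manager() as manager:
        d = manager.dict()
        d["TIK"] = [0, 0, 0]

        # d["TIK"] returns a copy, so this edit never reaches the manager.
        d["TIK"][0] = 1
        after_in_place = d["TIK"]

        # Fetch a local copy, edit it, then assign it back to the managed dict.
        local = d["TIK"]
        local[0] = 1
        d["TIK"] = local
        after_reassign = d["TIK"]

        # Assumption (Python 3.6+): a nested manager.list() is a proxy,
        # so in-place edits on it are forwarded to the manager.
        d["TOK"] = manager.list([0, 0, 0])
        d["TOK"][0] = 1
        nested = d["TOK"][:]  # slice the proxy to get a plain list back

    return after_in_place, after_reassign, nested


if __name__ == "__main__":
    print(demo())
```

The first entry of the returned tuple stays at its initial value because only the copy was edited; the other two show the re-assignment workaround and the nested-proxy alternative.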

In this particular case, it would mean that the apenddict function has to be modified slightly:

def apenddict(D, tik,gnr):
    tik_array = D[tik]  # Create local reference (a copy of the managed entry)
    for i in range(8):
        print(tik)
        # Shift the local array, not D[tik], which still holds the old data
        tik_array[:-1] = tik_array[1:]  # Update local reference
        tik_array[-1:, :] = gnr.next()  # Update local reference
    D[tik] = tik_array  # Assign local reference to managed dict

With these changes, your program ought to work more as you expect it to.

JohanL