I need to call an instance method of a class in parallel and count how many times it has been called, so that each call gets a unique identifier (used to store its results in a unique location).
Here is a question with solutions for what I want, but in Java.
Here is a minimal example:
para2.py, which sets up all the instance-method pickling stuff (less relevant):
    from copy_reg import pickle
    from types import MethodType
    from para import func

    def _pickle_method(method):
        return _unpickle_method, (method.im_func.__name__, method.im_self, method.im_class)

    def _unpickle_method(func_name, obj, cls):
        return cls.__dict__[func_name].__get__(obj, cls)

    pickle(MethodType, _pickle_method, _unpickle_method)
    func()
And now para.py contains:
    from sklearn.externals.joblib import Parallel, delayed
    from math import sqrt
    from multiprocessing import Lock

    class Thing(object):
        COUNT = 0
        lock = Lock()

        def objFn(self, x):
            with Thing.lock:
                mecount = Thing.COUNT
                Thing.COUNT += 1
            print mecount
            n = 0
            while n < 10000000:  # add a little delay for consistency
                n += 1
            return sqrt(x)

    def func():
        thing = Thing()
        y = Parallel(n_jobs=4)(delayed(thing.objFn)(i**2) for i in range(10))
        print y
Running python para2.py in a terminal prints:

    0
    0
    0
    0
    1
    1
    1
    1
    2
    2
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
I need the numbers in the vertical list to count from 0 to 9, but it appears that all four processes are still accessing and trying to update COUNT concurrently. How can I make this do what I want?
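For comparison, here is a minimal sketch of the behaviour being asked for. The output above is consistent with each worker process keeping its own copy of COUNT, so this sketch puts the counter in shared memory instead. It is only an illustration under assumptions: it is written for Python 3, uses the standard library's multiprocessing.Pool rather than joblib, uses a plain function instead of an instance method, and assumes a Unix-like system where the "fork" start method is available. The names `work`, `run`, `_init_worker` and `_counter` are hypothetical.

```python
import multiprocessing as mp
from math import sqrt

# Hypothetical module-level slot; each worker fills it in via _init_worker.
_counter = None


def _init_worker(shared_counter):
    """Runs once in every worker so all processes refer to the same counter."""
    global _counter
    _counter = shared_counter


def work(x):
    """Take a unique ID from the shared counter, then do the computation."""
    with _counter.get_lock():  # this lock is shared across processes
        call_id = _counter.value
        _counter.value += 1
    return call_id, sqrt(x)


def run(n_calls=10, n_procs=4):
    # Assumption: the "fork" start method exists (Unix-like systems). With
    # other start methods, guard the call to run() with
    # `if __name__ == "__main__":`.
    ctx = mp.get_context("fork")
    counter = ctx.Value("i", 0)  # a C int stored in shared memory
    pool = ctx.Pool(n_procs, initializer=_init_worker, initargs=(counter,))
    try:
        return pool.map(work, [i ** 2 for i in range(n_calls)])
    finally:
        pool.close()
        pool.join()
```

Calling `run()` returns a list of `(call_id, result)` pairs in input order. The IDs are unique across all workers, but which call receives which ID depends on scheduling. If the IDs only need to be unique and not assigned at call time, a simpler option is to number the inputs in the parent process and pass the number in as an extra argument.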