I am currently using concurrent.futures.ProcessPoolExecutor inside a method of a class, as follows:
import concurrent.futures
from collections import OrderedDict

import numpy as np
import pylogit


def parallel_fitting(index):
    # Unpack the arguments that the caller bundles into a single list
    X = index[0]
    model_spec = index[1]
    zeros = index[2]
    # X is the long-format DataFrame handed over by the caller
    model = pylogit.create_choice_model(data=X, alt_id_col='U_ID',
                                        obs_id_col='INDEX', choice_col='RES',
                                        specification=model_spec, model_type="MNL")
    model.fit_mle(zeros)
    return model


class something():
    def hyper_selection(self, X_y):
        # Create specification dictionary
        model_specification = OrderedDict()
        for variable in X_y.columns[2:]:
            model_specification[variable] = 'all_same'
        # Starting values for the MLE, one per specified variable
        zeros = np.zeros(len(model_specification))
        with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
            for _ in range(2):
                executor.submit(parallel_fitting, [X_y, model_specification, zeros])
The parallel_fitting function runs fine sequentially. However, when it is submitted to the ProcessPoolExecutor, the program keeps running and never completes. The failure point is the fit_mle line: without it, the program runs fine.
I suspect the pool is re-importing the modules, which leads to the creation of another ProcessPoolExecutor. I am running this on OSX and am hence unable to use fork(). Is there any way to circumvent this?
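For reference, here is a minimal sketch of the setup I would expect to rule out the re-import suspicion. It assumes the worker lives at module level, the pool is only created behind an `if __name__ == "__main__":` guard, and the "spawn" start method is requested explicitly. `fit_stub` and `run_jobs` are hypothetical names, and `fit_stub` is only a stand-in for parallel_fitting, not the pylogit call itself. Calling `result()` on each future should also surface any exception raised inside a worker instead of dropping it silently.

```python
import concurrent.futures
import multiprocessing


def fit_stub(args):
    """Hypothetical stand-in for parallel_fitting (no pylogit involved)."""
    start_values, offset = args
    return sum(start_values) + offset


def run_jobs(jobs, max_workers=2):
    # Ask for the "spawn" start method explicitly instead of relying on
    # the platform default.
    ctx = multiprocessing.get_context("spawn")
    with concurrent.futures.ProcessPoolExecutor(
        max_workers=max_workers, mp_context=ctx
    ) as executor:
        futures = [executor.submit(fit_stub, job) for job in jobs]
        # result() re-raises any exception that occurred in the worker,
        # and the timeout keeps a stuck worker from blocking this call forever.
        return [future.result(timeout=60) for future in futures]


if __name__ == "__main__":
    # With "spawn", each worker re-imports this module. The guard keeps
    # the workers from executing this block and creating pools of their own.
    print(run_jobs([([0.0, 0.0], 1), ([1.0, 2.0], 0)]))
```

If the real fit still hangs with this structure, the re-import is probably not the cause, and the problem would more likely be inside fit_mle itself.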