I am currently using concurrent.futures.ProcessPoolExecutor in a method of a class, as follows:

import concurrent.futures
from collections import OrderedDict

import numpy as np
import pylogit

def parallel_fitting(index):
    X = index[0]
    model_spec = index[1]
    zeros = index[2]
    # Build the MNL model from the DataFrame passed in with the job
    model = pylogit.create_choice_model(data = X,alt_id_col = 'U_ID',
                                  obs_id_col = 'INDEX',choice_col = 'RES',
                                  specification = model_spec,model_type="MNL")

    model.fit_mle(zeros)

    return model

class something(): 

    def hyper_selection(self, X_y):
        #Create specification dictionary
        model_specification = OrderedDict()
        for variable in X_y.columns[2:]:
            model_specification[variable] = 'all_same'
        zeros = np.zeros(len(model_specification))

        with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
            for _ in range(2):
                executor.submit(parallel_fitting, [X_y, model_specification, zeros])

The parallel_fitting function runs fine sequentially. When it is submitted to a ProcessPoolExecutor, however, the program keeps running and never completes. The failure point is the fit_mle line: without it, the program runs fine.

I suspect the pool is re-importing the modules, which leads to the creation of another ProcessPoolExecutor. I am running this on OSX, so I cannot use fork(). Is there any way to work around this?
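To make the re-import suspicion easier to test, here is a minimal sketch of the usual pattern for spawned workers. It is not a confirmed fix for the pylogit hang. The names fit_stub and run_jobs are hypothetical, and fit_stub only stands in for parallel_fitting so the sketch runs without pylogit. It assumes the script is run from a file, and it shows three things: pool creation kept behind an `if __name__ == "__main__":` guard, an explicit "spawn" context, and `result(timeout=...)` so that a worker exception or a hang surfaces instead of being silently dropped.

```python
import concurrent.futures
import multiprocessing as mp


def fit_stub(args):
    # Hypothetical stand-in for parallel_fitting: it sums its inputs instead
    # of fitting a pylogit model, so the sketch runs without pylogit.
    values, offset = args
    return sum(values) + offset


def run_jobs(jobs, max_workers=2, timeout=60):
    # Ask for the "spawn" start method explicitly (assumption: this matches
    # how the workers are started on the machine in question).
    ctx = mp.get_context("spawn")
    with concurrent.futures.ProcessPoolExecutor(
        max_workers=max_workers, mp_context=ctx
    ) as executor:
        futures = [executor.submit(fit_stub, job) for job in jobs]
        # result() re-raises any exception raised in the worker, and the
        # timeout turns a silent hang into a TimeoutError.
        return [f.result(timeout=timeout) for f in futures]


if __name__ == "__main__":
    # Spawned workers re-import this module; the guard keeps them from
    # creating pools of their own during that import.
    print(run_jobs([([1, 2, 3], 0), ([4, 5], 1)]))
```

If a version of this with the real fitting function still times out, that would suggest the problem lies inside fit_mle in the worker process rather than in nested pool creation.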

Ivan To