0

Hi I don't feel like I have quite understood multiprocessing in python correctly.

I want to run a function called 'run_worker' (which is simply code that runs and manages a subprocess) 20 times in parallel and wait for all the functions to complete. Each run_worker should run on a separate core/thread. I don' mind what order the processes complete hence i used async and i dont have a return value so i used map

I thought that I should use:

if __name__ == "__main__":
    num_workers = 20
    param_map = []
    for i in range(num_workers):
        param_map += [experiment_id]
        
    pool = mp.Pool(processes= num_workers)
    pool.map_async(run_worker, param_map)
    
    pool.close()
    pool.join()

However this code exits straight away and doesn't appear to execute run_worker properly. Also do I really have to create a param_map of the same experiment_id to pass to the worker because this seems like a hack to get the number of run_workers created. Ideally i would like to run a function with no parameters and no return value over multiple cores. 

Note I am using windows 2019 server in AWS.

edit added run_worker which calls a subprocess which write to file:

def run_worker(experiment_id):
    hostname = socket.gethostname()
    experiment = conn.experiments(experiment_id).fetch()

    while experiment.progress.observation_count < experiment.observation_budget:
        suggestion = conn.experiments(experiment.id).suggestions().create()
        value = evaluate_model(suggestion.assignments)
        conn.experiments(experiment_id).observations().create(suggestion=suggestion.id,value=value,metadata=dict(hostname=hostname),)
        # Update the experiment object
        experiment = conn.experiments(experiment_id).fetch()
azuric
  • 2,679
  • 7
  • 29
  • 44

1 Answers1

0

It seems that for this simple purpose you can better be using pool.map instead of pool.map_async. They both run in parallel, however pool.map is blocking until all operations are finished (see also this question). pool.map_async is especially meant for situations like this:

result = map_async(func, iterable)
while not result.ready():
    // do some work while map_async is running
    pass

// blocking call to get the result
out = result.get()

Regarding your question about the parameters, the fundamental idea of a map operation is to map the values of one list/array/iterable to a new list of values of the same size. As far as I can see in the docs, multiprocessing does not provide any method to run multiple functions without parameters.

If you would also share your run_worker function, that might help to get better answers to your question. That might also clear up why you would run a function without any arguments and return values using a map operation in the first place.

  • So in short: do `res = pool.map_async(run_worker, param_map)` and then `res.get()` or just use `pool.map(run_worker, param_map)`. Both should do the trick, please let us know whether that works for you. Also, there might still be issues with your run_worker function. That is not deductible from the code you provided. – Gilles Ottervanger Apr 10 '21 at 13:02
  • So it doesn't work on WIndows i suspect something else is the issue here even with the if main statement as per mutiprocessor isntructions. The run_worker works fine but this spawns only 1 version of this function. – azuric Apr 10 '21 at 13:42
  • Note that the fact that a single call or several sequential calls to `run_worker` work as expected, does not guarantee it will work in parallel. Your function seems to rely on shared resources (`socket` and `conn`). Are you sure all operations are thread-safe? – Gilles Ottervanger Apr 10 '21 at 15:41
  • yes in fact this code is an optimizer built for parallel operations which works fine in a linux env but appears to have issues on windows. Atm the best solution I can think of is to run it on a condor grid and let that manage the jobs instead of multiprocessor. – azuric Apr 11 '21 at 11:18
  • This information completely changes the nature of the question. Please next time provide these kinds of details in your initial question. – Gilles Ottervanger Apr 12 '21 at 10:52