5

I have a (large) list of male and female agents.

I want to apply different functions to each.

How can I use Pool in such a case, given that the agents are independent of each other?

An example would be:

males = ['a', 'b', 'c']
females = ['d', 'e', 'f']
for m in males:
    func_m(m)
for f in females:
    func_f(f)

I started like that:

from multiprocessing import Pool
p = Pool(processes=2)
p.map() # Here is the problem

I would like to have something like:

p.ZIP(func_f for f in females, func_m for m in males) # pseudocode
Thomas Moreau
B Furtado

2 Answers

2

It is possible to launch the computations asynchronously using map_async. Each call submits all of its jobs to the pool and returns immediately; you can then collect the results in a single list by calling get on each result object.

from multiprocessing import Pool

pool = Pool(4)
res_male = pool.map_async(func_m, males)
res_females = pool.map_async(func_f, females)

res = res_male.get()
res.extend(res_females.get())

You could also look at the more modern concurrent.futures API, which is more intuitive for this kind of computation.
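As a rough sketch of what the concurrent.futures version might look like: the bodies of func_m and func_f below are placeholders, and run_all with its executor_cls parameter is a hypothetical helper introduced only for illustration. Each agent is submitted as its own task, so male and female jobs share the same set of workers.

```python
from concurrent.futures import ProcessPoolExecutor


def func_m(agent):
    # Placeholder for the real per-male work.
    return f"male:{agent}"


def func_f(agent):
    # Placeholder for the real per-female work.
    return f"female:{agent}"


def run_all(males, females, executor_cls=ProcessPoolExecutor, max_workers=4):
    """Submit every agent as a separate task and collect the results.

    executor_cls is a hypothetical knob so that a ThreadPoolExecutor can be
    swapped in, for example while debugging.
    """
    with executor_cls(max_workers=max_workers) as executor:
        futures = [executor.submit(func_m, m) for m in males]
        futures += [executor.submit(func_f, f) for f in females]
        # result() blocks until the task is done and re-raises any exception
        # that occurred in the worker.
        return [future.result() for future in futures]


if __name__ == "__main__":
    # The guard is needed on platforms that start workers with "spawn"
    # (Windows in particular).
    print(run_all(["a", "b", "c"], ["d", "e", "f"]))
```

If nothing needs to be returned, calling result() is still worthwhile, since otherwise an exception raised inside a worker would go unnoticed.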

Thomas Moreau
  • Thanks @thomas-moreau. But what if I don't need to return anything? The idea is just to start internal methods in each agent. – B Furtado Mar 04 '17 at 22:06
  • Hmm. I got an error `An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support()` – B Furtado Mar 04 '17 at 22:14
  • One more thing. The calculation is part of a module that is imported and executed. Do I need if __name__ == "__main__": ? – B Furtado Mar 04 '17 at 23:06
  • Actually, calling two map_async in sequence runs one process, then the other. I was wondering if it is possible to spawn two functions and two iterators simultaneously! – B Furtado Mar 05 '17 at 15:06
  • The `Pool` is designed to work through a list of jobs in sequence. If you want to run two processes in parallel, you should maybe consider using 2 `Process` objects. Another way to get an alternation of male/female is to use `apply_async` to submit the jobs in the right order. Could you explain your constraints a bit more? – Thomas Moreau Mar 05 '17 at 16:01
  • Thanks. I can join the two functions and run Pool once. I haven't been able to implement the solution as I am using Windows. main.py initiates a process and calls time_iteration.py. In time_iteration a function running_month calls check_demographics where I want speed. I don't know how to use Pool in __main__ as the objects and month data have not been created yet. Thanks a lot. [See question about Windows here](http://stackoverflow.com/questions/42602584/how-to-use-multiprocessing-pool-in-an-imported-module?noredirect=1#comment72337652_42602584) – B Furtado Mar 05 '17 at 16:31
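The apply_async idea from the comments could be sketched roughly as follows. The function bodies are placeholders and submit_interleaved is a hypothetical helper name, not something from the original answer. The pool is created under the `if __name__ == "__main__":` guard in the entry script, which is what the bootstrapping error quoted above asks for; the helper itself could live in an imported module.

```python
from itertools import zip_longest
from multiprocessing import Pool


def func_m(agent):
    # Placeholder for the real per-male work.
    return f"male:{agent}"


def func_f(agent):
    # Placeholder for the real per-female work.
    return f"female:{agent}"


_MISSING = object()  # marks the padding when the two lists differ in length


def submit_interleaved(pool, males, females):
    """Queue jobs in male/female alternation and wait for all of them."""
    pending = []
    for m, f in zip_longest(males, females, fillvalue=_MISSING):
        if m is not _MISSING:
            pending.append(pool.apply_async(func_m, (m,)))
        if f is not _MISSING:
            pending.append(pool.apply_async(func_f, (f,)))
    # get() blocks until each job finishes and re-raises worker exceptions,
    # so it is worth calling even when the return values are not needed.
    return [job.get() for job in pending]


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(submit_interleaved(pool, ["a", "b", "c"], ["d", "e", "f"]))
```

Because apply_async returns immediately, all jobs are queued before any result is awaited, and the workers pick them up in the alternating order in which they were submitted.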
0

Not a great answer, but the first thing that came to mind:

import itertools
f = lambda t: func_m(t[0]) if t[1] else func_f(t[0])
p.map(f, itertools.chain(((x, 0) for x in females), ((x, 1) for x in males)))
BallpointBen
  • Thanks. How can I implement it if in fact I have more than one argument to pass? – B Furtado Mar 03 '17 at 14:37
  • I implemented it as `f = lambda t: process_males(mortality_men, my_agents, graveyard, families, firms, year, agent=t[0]) if t[1] \ else process_females(mortality_women, fertility, year, families, my_agents, graveyard, firms, agent=t[0])` But I got an error: `_pickle.PicklingError: Can't pickle <function <lambda> at 0x000000000AF9BF28>: attribute lookup <lambda> on demographics failed` – B Furtado Mar 03 '17 at 14:58
  • Oh right, `multiprocessing` can't take lambdas... http://stackoverflow.com/questions/4827432/how-to-let-pool-map-take-a-lambda-function – BallpointBen Mar 03 '17 at 14:59
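One way around the lambda limitation, sketched here under assumptions: replace the lambda with a module-level function, which can be pickled, and bind the extra arguments with functools.partial. The names process_male, process_female, dispatch, and tag_agents, as well as the single year argument, are hypothetical stand-ins for the real functions and their longer argument lists.

```python
from functools import partial
from multiprocessing import Pool


def process_male(agent, year):
    # Placeholder for the real male-specific processing.
    return f"male:{agent}:{year}"


def process_female(agent, year):
    # Placeholder for the real female-specific processing.
    return f"female:{agent}:{year}"


def dispatch(tagged_agent, year):
    """Module-level replacement for the lambda; picks a function by tag."""
    agent, is_male = tagged_agent
    if is_male:
        return process_male(agent, year)
    return process_female(agent, year)


def tag_agents(males, females):
    # Pair every agent with a flag saying which function it needs.
    return [(f, False) for f in females] + [(m, True) for m in males]


if __name__ == "__main__":
    # partial objects can be pickled as long as the wrapped function and the
    # bound arguments can be pickled too.
    worker = partial(dispatch, year=2017)
    with Pool(processes=2) as pool:
        print(pool.map(worker, tag_agents(["a", "b"], ["d", "e"])))
```

Note that every bound argument is sent to the worker processes as a copy, so changes a worker makes to shared structures such as lists of agents would not be visible in the parent process.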