
Let's say I have 10000 tasks at hand. How can I process them in parallel, running precisely 8 processes at any time? The moment a task finishes, the next task should be fetched for execution immediately.

from multiprocessing import Process

for e in arr:
    pr = Process(target=execute, args=(q, e))
    pr.start()
    pr.join()  # waits for pr before the next iteration, so tasks run one at a time

I want to do this because my CPU has only 8 hardware threads. Swarming it with 10000 processes at once would slow down the overall computation due to context-switching overhead. My memory is also limited.

(Edit: This is not a duplicate of this question as I am not asking how to fork a process.)

Chong Lip Phang
  • Look at the [Pool classes in the docs](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool) – President James K. Polk Nov 12 '18 at 05:11
  • Possible duplicate of [How to process a list in parallel in Python?](https://stackoverflow.com/questions/51814897/how-to-process-a-list-in-parallel-in-python) – U13-Forward Nov 12 '18 at 05:13
  • I don't think it is a duplicate of that question, as I am not asking how to fork a process. Anyway, Pool is probably the solution to my problem. Thanks, James! – Chong Lip Phang Nov 12 '18 at 05:17

2 Answers


I think your problem is solved if you move the join() calls into a separate loop. Right now you start a process, immediately join() it, and only then fork the next one, so only one child process ever runs at a time.

procs = []
for e in arr:
    pr = Process(target=execute, args=(q, e))
    pr.start()
    procs.append(pr)  # keep a reference to every child

for pr in procs:
    pr.join()  # wait for all children to finish

Note that this still spawns one process per task all at once rather than capping concurrency at 8, so for 10000 tasks you are better off going with Pool and its map functions.
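A minimal sketch of that route, with hypothetical stand-ins for execute and arr (one caveat: if the real execute needs the queue q from the question, Pool workers cannot take a plain multiprocessing.Queue as an argument; a Manager().Queue() would be needed instead):

from multiprocessing import Pool

def execute(e):
    return e * e        # hypothetical stand-in for the real task

if __name__ == '__main__':
    arr = range(10000)  # hypothetical task list
    with Pool(8) as pool:  # exactly 8 worker processes
        # map blocks until everything is done; a worker that finishes
        # a task immediately picks up the next one
        results = pool.map(execute, arr)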


For Pool to work here I need to call get() too, so that the main process actually waits for the results (get() also re-raises any exception a task hit):

from multiprocessing import Pool

pl = []
pool = Pool(8)  # exactly 8 worker processes
for e in arr:
    # args must be a tuple, hence the trailing comma in (e,)
    pl.append(pool.apply_async(execute, (e,)))
for res in pl:
    res.get()  # block until this task's result is ready
pool.close()
pool.join()
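Since memory is a concern, imap_unordered may be a better fit than collecting 10000 AsyncResult objects: it feeds the same 8 workers and yields each result as soon as it completes. A minimal sketch under the same hypothetical stand-ins as above:

from multiprocessing import Pool

def execute(e):
    return e * e        # hypothetical stand-in for the real task

if __name__ == '__main__':
    arr = range(10000)  # hypothetical task list
    with Pool(8) as pool:
        # chunksize batches tasks to cut inter-process overhead
        for result in pool.imap_unordered(execute, arr, chunksize=32):
            pass  # consume each result as soon as any worker finishes it

The unordered variant trades result ordering for lower latency; plain imap preserves input order if that matters.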
Chong Lip Phang