I am using multiprocessing in Python with:

import multiprocessing as mp
all_arguments = range(0,20)

pool = mp.Pool(processes=7) 
all_items = [pool.apply_async(main_multiprocessing_function, args=(argument_value,)) for argument_value in all_arguments]
for item in all_items:
    item.get()

In the above, as far as I am aware, after a worker process finishes a task, it moves on to the next value. Is there any way instead to force a 'new' worker process to be initialized from scratch each time, rather than reusing the old one?

[Specifically, main_multiprocessing_function calls multiple other functions that each use caching to speed up the processing within each task. All those caches are, however, redundant for the next item to be processed, so I am interested in a way of resetting everything back to a fresh state.]
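For illustration, the kind of caching meant here looks roughly like this (a minimal sketch; expensive_helper is a hypothetical stand-in, assuming functools.lru_cache-style memoisation):

from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_helper(key):
    # Placeholder for a slow computation; results stay cached for the
    # lifetime of the worker process that computed them.
    return key * key

def main_multiprocessing_function(argument_value):
    # The helper's cache speeds up repeated calls within this task, but
    # it lingers if the same worker process picks up the next task.
    return sum(expensive_helper(i % 5) for i in range(argument_value))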

  • Side-note: [You're not doing multithreading; the workers are separate processes, not threads.](http://stackoverflow.com/q/200469/364696) – ShadowRanger Oct 19 '16 at 03:33

1 Answer

From the docs:

maxtasksperchild is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable unused resources to be freed. The default maxtasksperchild is None, which means worker processes will live as long as the pool.

Just create the pool as

pool = mp.Pool(processes=7, maxtasksperchild=1)
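
With maxtasksperchild=1, every worker exits after completing a single task, so module-level state and function caches die with it. A minimal runnable sketch (my own demo, not your original code: it prints os.getpid() so you can see each task land in a fresh process):

import multiprocessing as mp
import os

def main_multiprocessing_function(argument_value):
    # Report which process handled this task.
    return argument_value, os.getpid()

if __name__ == '__main__':
    pool = mp.Pool(processes=7, maxtasksperchild=1)
    all_items = [pool.apply_async(main_multiprocessing_function, args=(v,))
                 for v in range(20)]
    for item in all_items:
        print(item.get())  # each task reports a different PID
    pool.close()
    pool.join()

Note that spawning a fresh process per task adds startup overhead, so this only pays off when clearing the cached state matters more than the respawn cost.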