
I have a number of tasks that I run in parallel so that they finish in a reasonable time.

Following the terminology from this answer, my problem belongs to the wide scenario: some tasks may finish very quickly, while others can run for days. I am using `pool.map_async` with `chunksize=1`.

When almost all tasks are finished, some of the processes become idle, yet they still hold on to system memory. Is it possible to free that memory, or to close those processes?
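A minimal sketch of the setup described above, for reference. The names `run_task` and `run_all`, the pool size, and the optional `start_method` argument are hypothetical stand-ins, not the original code.

```python
import multiprocessing as mp
import os


def run_task(n):
    """Hypothetical stand-in for one task; returns the input and the worker PID."""
    return n, os.getpid()


def run_all(inputs, processes=4, start_method=None):
    """Run every input through the pool, handing out one task at a time."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    with ctx.Pool(processes=processes) as pool:
        # chunksize=1: each worker fetches a single task at a time, so a
        # long-running task never holds back other tasks queued behind it.
        async_result = pool.map_async(run_task, inputs, chunksize=1)
        return async_result.get()  # blocks until every task has finished
```

Call `run_all(...)` from under an `if __name__ == "__main__":` guard, which the `spawn` and `forkserver` start methods require. Workers that have run out of tasks stay alive until the pool itself is shut down, which matches the idle processes visible in `ps`.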

My ideas:

  • I tried freeing the heavy variables created in each task before the task finishes. That decreased the memory taken from 1.6% to 1.3%.
  • One workaround would be to add some dummy tasks at the end of the iterable, but that doesn't seem like the right solution. I haven't tried it yet.

P.S. It is also surprising to me that idle processes still consume CPU. I noticed that it takes quite a long time for %CPU to drop from 99% to a value close to 0.

piogor
  • *almost all tasks are finished* - so you were not waiting until the pool had been closed/joined? – RomanPerekhrest Sep 01 '23 at 12:53
  • Either build your own pool where you can control the lifespan of child processes more finely, or adjust the `maxtasksperchild` parameter of the pool constructor to free each child process after completing a given number of jobs. – Aaron Sep 01 '23 at 17:10
  • @RomanPerekhrest I am/was waiting. But I peek into `ps` to see how it goes. – piogor Sep 02 '23 at 17:08
  • @Aaron Thanks! Using `maxtasksperchild` helped. The idle processes take 0.4% of memory (and 0% CPU very quickly). This is the same amount as the parent process. Still, it is not 0 but at least it is much lower. – piogor Sep 02 '23 at 17:59
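
The `maxtasksperchild` suggestion from the comments could look roughly like the sketch below. The value `1`, the names `run_task` and `run_with_recycling`, and the `start_method` argument are assumptions for illustration; the thread does not say which value was actually used.

```python
import multiprocessing as mp
import os


def run_task(n):
    """Hypothetical stand-in for one task; returns the input and the worker PID."""
    return n, os.getpid()


def run_with_recycling(inputs, processes=2, start_method=None):
    """Run the tasks in a pool whose workers are replaced after each task."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    # maxtasksperchild=1: a worker exits after completing one task and the
    # pool starts a fresh process in its place, so memory accumulated by a
    # finished task is returned to the OS when that worker exits.
    with ctx.Pool(processes=processes, maxtasksperchild=1) as pool:
        return pool.map_async(run_task, inputs, chunksize=1).get()
```

With `chunksize=1`, each item counts as one task toward the limit, so every item runs in a newly started worker. The replacement workers that remain idle at the end are fresh processes, which fits the observation that they use about as much memory as the parent rather than zero.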
