I have a loop that I've been trying to speed up. I noticed that Python was only using a single core, so I imported the multiprocessing package and set up a pool. Now the work is distributed over a number of cores, but each of them seems to be limited to ~10% usage.

Is this expected/optimal? Or is there a way to get more utilisation from each core?

[htop screenshot]

Code:

import os
from multiprocessing.dummy import Pool as ThreadPool

# ...more code here...

pool = ThreadPool(os.cpu_count())
pool.starmap(getSubject, zip(range(1, Nsub)))
pool.close()
pool.join()

P.S. Before using the pool, htop would show one core at 100% and the others at ~0%.

collector
1 Answer

multiprocessing.dummy.Pool is a thread pool. Because of the GIL (Global Interpreter Lock), threads cannot use multiple cores to the fullest for CPU-bound work. Change it to multiprocessing.Pool if you wish to use processes instead.

Note that the total usage of the cores also depends on what exactly getSubject does. If it does synchronization (e.g. locking) or is I/O-bound, you may see reduced parallelism as well.

freakish