
I have a problem where I need to solve thousands of independent nonnegative least squares problems using `nnls` in scipy. All problems are small, with matrices of about 100x100. To speed things up I've tried to use the multiprocessing module in python with the Pool class. I get about a factor-2 improvement if I set the number of threads in numpy to 1 and use multiprocessing, versus using multithreaded numpy and no multiprocessing. But the performance is very unpredictable. For instance, if I move sections of code into a separate function (to make it easier to read), or call `pool.map` from a class method, performance can decrease by 50%. So it seems like the multiprocessing module is too unreliable to be used.

Does anyone know what can cause this behaviour or know of a better alternative to multiprocessing?

AndersG
  • Welcome to SO. There could be many reasons for what you're seeing, and the best place to start is probably understanding where the performance bottleneck is. Is it CPU? Is it inter-process communication? Is it something else? Without knowing the answer to these questions, and understanding your application better, it's pretty difficult to provide meaningful insights. – Roy2012 Jul 15 '20 at 17:42
  • I had a similar question a long time ago: [Minimize overhead in Python multiprocessing.Pool](https://stackoverflow.com/q/37068981/6228891). A bottleneck with `multiprocessing.Pool` is that passing large-ish data objects is slow due to the pickle/unpickle roundtrips. Not sure whether a 100x100 matrix qualifies though (25 µs for the roundtrip, 1 ms for `np.linalg.lstsq`). – Han-Kwang Nienhuys Jul 15 '20 at 18:05
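The pickle/unpickle cost mentioned in the comment above can be checked directly. A rough sketch of such a measurement (the exact microsecond figures will vary by machine; the numbers in the comment are the commenter's, not reproduced here):

```python
import pickle
import timeit

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 100))  # same shape as the problems in the question

def roundtrip():
    # Serialize and deserialize one matrix, as Pool does per task.
    return pickle.loads(pickle.dumps(A, protocol=pickle.HIGHEST_PROTOCOL))

n = 1000
seconds_per_call = timeit.timeit(roundtrip, number=n) / n
print(f"pickle roundtrip: {seconds_per_call * 1e6:.1f} us per 100x100 matrix")
```

If the roundtrip is a small fraction of one solve, serialization is probably not the bottleneck; if it is comparable, chunking many problems per task (e.g. `pool.map(..., chunksize=...)`) is a common mitigation.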

0 Answers