I have a machine learning application in Python, and I'm using the multiprocessing module to parallelize some of the work (specifically feature computation).

Now, multiprocessing works differently on Unix variants and on Windows, because the available start methods differ:
Unix (macOS/Linux): fork, forkserver, spawn
Windows: spawn
Related question: "Why multiprocessing.Process behave differently on windows and linux for global object and function arguments"
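For reference, the start methods available on a given platform can be listed with the standard library. This is only a quick check, not part of my application; the variable name is just for illustration.

```python
import multiprocessing as mp

# List the start methods this platform supports.
# Per the multiprocessing docs, the first entry is the platform default.
methods = mp.get_all_start_methods()
print(methods)
```

On Windows the list contains only spawn, which is why every new process starts a fresh interpreter and re-imports the modules.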

Because spawn is used on Windows, launching multiprocessing processes is really slow: each new process loads all the modules from scratch.

Is there a way to speed up the creation of the extra processes on Windows? (Using threads instead of multiple processes is not an option.)

Vishal

1 Answer

Instead of creating new processes each time, I highly suggest using concurrent.futures.ProcessPoolExecutor and keeping the executor alive in the background.

That way, you don't pay the startup cost for every piece of work. The worker processes stay alive in the background, and you pass them work through the executor's methods (such as submit and map) or through queues and pipes.

Bottom line: don't create new processes each time. Keep them alive and pass them work.

Bharel