
I am running sklearn.grid_search.GridSearchCV in parallel across several CPU cores. Calling the fit method creates a copy of my data for each process, and my processes crash because they run out of memory.

Is there a way to prevent the function from copying the data for each process? Can I use shared memory for all cores?

Ohumeronen

1 Answer


When the search runs in parallel, Python starts a separate process for each parallel task, and each new process gets its own copy of the data. I would recommend using the shared-memory facilities of multiprocessing to avoid this, so that all workers read from a single copy. You can see an example at https://github.com/alvarouc/polyssifier/blob/master/polyssifier/polyssifier.py#L87
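To illustrate the general idea, here is a minimal sketch that uses only the standard library (multiprocessing.shared_memory, Python 3.8+). It is not the code from the linked repository, and it does not plug into GridSearchCV directly. The helper names create_shared, partial_sum and _worker are made up for this example. The data is written once into a shared block, and each worker attaches to that block by name instead of receiving its own copy.

```python
from multiprocessing import Process, Queue, shared_memory

ITEM_SIZE = 8  # bytes per float64 value (memoryview format "d")


def create_shared(values):
    """Copy `values` once into a new shared-memory block and return the block."""
    nbytes = len(values) * ITEM_SIZE
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    # The OS may round the block size up, so slice to the exact length first.
    with shm.buf[:nbytes].cast("d") as view:
        for i, value in enumerate(values):
            view[i] = value
    return shm


def partial_sum(name, start, stop):
    """Attach to the block called `name` and sum items start..stop-1 in place."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        with shm.buf[start * ITEM_SIZE:stop * ITEM_SIZE].cast("d") as view:
            return sum(view)
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()


def _worker(name, start, stop, out):
    """Process entry point: compute one partial sum and report it."""
    out.put(partial_sum(name, start, stop))


if __name__ == "__main__":
    data = [float(i) for i in range(1000)]
    shm = create_shared(data)
    try:
        out = Queue()
        # Only the block name and the index range are sent to each worker.
        jobs = [
            Process(target=_worker, args=(shm.name, lo, lo + 250, out))
            for lo in range(0, 1000, 250)
        ]
        for job in jobs:
            job.start()
        totals = [out.get() for _ in jobs]
        for job in jobs:
            job.join()
        print(sum(totals))
    finally:
        shm.close()
        shm.unlink()  # free the shared block once every worker is done
```

With NumPy data you can wrap the same buffer without copying via np.ndarray(shape, dtype=dtype, buffer=shm.buf). Whether this helps inside GridSearchCV depends on how your scikit-learn version passes the data to its workers, so treat this as a starting point.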