
My task looks like this:

from sklearn.gaussian_process import GaussianProcessRegressor

num = 100
model = dict()

# kernel is defined elsewhere (e.g. a kernel from sklearn.gaussian_process.kernels)
for i in range(num):
    model[i] = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=20)

for i in range(num):
    model[i].fit(X, Y)

where X, Y are my training data containing features and labels, respectively.

My Ubuntu machine has 4 CPUs. To cut the training time to roughly a quarter of what the above code takes, I would like to execute model[0].fit(X, Y) on CPU-0, model[1].fit(X, Y) on CPU-1, model[2].fit(X, Y) on CPU-2, and model[3].fit(X, Y) on CPU-3 simultaneously. What should I do?

guorui
  • What if you divide the range of the loop into 4 groups? E.g. `for i in range(25)`, `for i in range(25,50)`, `for i in range(50,75)` and `for i in range(75, 100)`. – YusufUMS Apr 26 '19 at 02:55
  • @Yusufsn Could you please post your (pseudo)code? I would like to run it. – guorui Apr 26 '19 at 03:02
  • Use the `multiprocessing` module's `Pool` and `map` functions. A minimal example is on the documentation page (see the sketch after these comments). – hunzter Apr 26 '19 at 03:24
  • Perhaps this would help you: https://stackoverflow.com/questions/41588383/how-to-run-keras-on-multiple-cores – hossein hayati Apr 26 '19 at 03:33
  • @hunzter I can't find where that is. Can you show me the link? Thanks. – guorui Apr 26 '19 at 06:33
  • @hunzter I have been trying `Pool` and `map` since yesterday but have made little progress. – guorui Apr 26 '19 at 06:35
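
For context, a minimal sketch of the `multiprocessing.Pool` approach suggested in the comments could look like the following. The dummy X, Y, and RBF kernel are placeholders, not from the original post; replace them with your own data and kernel.

from multiprocessing import Pool
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Placeholder training data and kernel; substitute your own X, Y, kernel
X = np.random.rand(50, 3)
Y = np.random.rand(50)
kernel = RBF()

def train_one(_):
    # Each worker process fits one independent regressor on the shared data
    m = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=20)
    m.fit(X, Y)
    return m

if __name__ == "__main__":
    # 4 worker processes, one per CPU; each picks up tasks from the queue
    with Pool(processes=4) as pool:
        models = pool.map(train_one, range(100))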

1 Answer


Build the lists input_x and input_y from your actual training data, then let joblib fit the models in parallel:

import joblib
from sklearn.gaussian_process import GaussianProcessRegressor

# One copy of the training data per model to be fitted
input_x = [X for i in range(100)]
input_y = [Y for i in range(100)]

def trainmodel(X, Y):
    # Pass kernel=kernel here if you want the same kernel as in the question
    model = GaussianProcessRegressor(n_restarts_optimizer=20)
    model.fit(X, Y)
    return model

# Fit the 100 models on 4 worker processes
models = joblib.Parallel(n_jobs=4, verbose=1)(map(joblib.delayed(trainmodel), input_x, input_y))
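
Once it finishes, models is an ordinary Python list of fitted regressors, so you can use them directly. For example (X_test here is an assumed test array, not part of the original post):

predictions = [m.predict(X_test) for m in models]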

You should also check the number of CPUs available, just in case:

import multiprocessing
multiprocessing.cpu_count()
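
If you prefer not to hard-code the count, joblib.Parallel also accepts n_jobs=-1, which uses all available cores.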
Nic Wanavit