
I need to run many cross-validations at once for specific groups of SVR hyperparameters, ((C_0, gamma_0), (C_1, gamma_1), ..., (C_n, gamma_n)), so I am looking for a way to parallelize them to speed things up.

Maybe it would be possible to run GridSearchCV so that, instead of checking every possible combination of hyperparameters, it checks them in an element-wise manner. Example:

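from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
                     'C': [100, 1000]}]

clf = GridSearchCV(SVR(), tuned_parameters, cv=5, n_jobs=-1)

# X_train and y_train are the training data, defined elsewhere
clf.fit(X_train, y_train)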

In this case, only two pairs of hyperparameters would be checked, namely (1e-3, 100) and (1e-4, 1000), instead of all four combinations.

DexzMen
  • Have you considered the `n_jobs` option? It will not parallelize over the hyperparameters but rather over the cross-validation runs for each set of hyperparameters. – Kefeng91 May 12 '18 at 08:42
  • Just edited the post. I always use n_jobs=-1. However, this does not solve my problem. – DexzMen May 12 '18 at 10:22
  • It sounds like you are looking not for a way to parallelise processing (that is achieved by setting `n_jobs=N`) but for a way to process a custom set of parameters instead of the full grid. If so, why don't you either run CV yourself by directly looping through `KFold.split()` [see example in the docs](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html) or use [RandomizedSearchCV](http://scikit-learn.org/0.18/modules/generated/sklearn.model_selection.RandomizedSearchCV.html#sklearn.model_selection.RandomizedSearchCV.fit)? – Mischa Lisovyi May 13 '18 at 18:38
  • specifically considering Vivek Kumar's answer, the question might be a duplicate of [this question](https://stackoverflow.com/questions/45352420/avoid-certain-parameter-combinations-in-gridsearchcv) – Quickbeam2k1 May 14 '18 at 06:33
  • @Quickbeam2k1 you are right. I'm sorry. I wasn't able to find this answer. – DexzMen May 14 '18 at 10:51

1 Answer


You can pass a list of dicts to specify the parameters.

Something like this:

tuned_parameters = [{'kernel': ['rbf'], 
                     'gamma': [1e-3],
                     'C': [100]}, 
                    {'kernel': ['rbf'], 
                     'gamma': [1e-4],
                     'C': [1000]}]

Calling clf.fit() will now search over both elements of the parameter list, using all the values from one dict at a time.

So only two combinations will be used: ('rbf', 1e-3, 100) and ('rbf', 1e-4, 1000)
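
When there are many (C, gamma) pairs, writing each dict by hand gets tedious. The following is a minimal sketch of how such a list could be generated from paired lists. The helper name `paired_param_grid` and its `fixed` argument are hypothetical, made up for this illustration, and are not part of scikit-learn.

```python
def paired_param_grid(fixed=None, **paired):
    """Build a list of single-value dicts, one per element-wise tuple.

    `fixed` holds parameters shared by every entry (e.g. the kernel);
    `paired` holds equal-length lists that are combined position by position.
    This is a hypothetical helper, not a scikit-learn function.
    """
    fixed = fixed or {}
    lengths = {len(values) for values in paired.values()}
    if len(lengths) > 1:
        raise ValueError("all paired hyperparameter lists must have the same length")

    names = list(paired)
    grid = []
    for combo in zip(*paired.values()):
        # Each value is wrapped in a one-element list, so the search
        # sees exactly one candidate per dict.
        entry = {name: [value] for name, value in fixed.items()}
        entry.update({name: [value] for name, value in zip(names, combo)})
        grid.append(entry)
    return grid


tuned_parameters = paired_param_grid(
    fixed={'kernel': 'rbf'},
    gamma=[1e-3, 1e-4],
    C=[100, 1000],
)
```

The resulting `tuned_parameters` has the same shape as the hand-written list above, so it can be passed to `GridSearchCV(SVR(), tuned_parameters, cv=5, n_jobs=-1)` in the same way.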

Vivek Kumar
  • Great, it works, thank you. Unfortunately I cannot up vote your answer because I have no reputation here. Hopefully somebody will find this answer helpful. – DexzMen May 14 '18 at 10:47