Is there a way to output the results of all cross-validation models in Spark (Scala)?

Question

I am using cross-validation for model and parameter selection in Spark. Because of an application requirement, I need to know not only the best model but also the results for all models. When I worked with Python's scikit-learn, I could use

clf = GridSearchCV()
clf.cv_results_

to print the results for all models, which look something like this:

Grid scores on development set:

0.986 (+/-0.016) for {'C': 1, 'gamma': 0.001, 'kernel': 'rbf'}
0.959 (+/-0.029) for {'C': 1, 'gamma': 0.0001, 'kernel': 'rbf'}
0.988 (+/-0.017) for {'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}
0.982 (+/-0.026) for {'C': 10, 'gamma': 0.0001, 'kernel': 'rbf'}
0.988 (+/-0.017) for {'C': 100, 'gamma': 0.001, 'kernel': 'rbf'}
0.982 (+/-0.025) for {'C': 100, 'gamma': 0.0001, 'kernel': 'rbf'}
0.988 (+/-0.017) for {'C': 1000, 'gamma': 0.001, 'kernel': 'rbf'}
0.982 (+/-0.025) for {'C': 1000, 'gamma': 0.0001, 'kernel': 'rbf'}
0.975 (+/-0.014) for {'C': 1, 'kernel': 'linear'}
0.975 (+/-0.014) for {'C': 10, 'kernel': 'linear'}
0.975 (+/-0.014) for {'C': 100, 'kernel': 'linear'}
0.975 (+/-0.014) for {'C': 1000, 'kernel': 'linear'}

In Spark I have:

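import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.tuning.CrossValidator

// pipeline, paramLRGrid and trainingData are defined elsewhere
val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new MulticlassClassificationEvaluator)
  .setEstimatorParamMaps(paramLRGrid)
  .setNumFolds(3)
val cvModel = cv.fit(trainingData)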

I am wondering whether Spark has something similar to clf.cv_results_ that lets me see the results for all the models.

score 0 · Answer 1 · answered Jan 01 '19 at 15:25

This should help:

cvModel.subModels

as described here: Spark CrossValidatorModel access other models than the bestModel?
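To make this concrete, here is a minimal sketch of how the per-model scores could be listed, roughly in the spirit of clf.cv_results_. It pairs cvModel.getEstimatorParamMaps with cvModel.avgMetrics, which holds the metric averaged over the folds for each parameter map, in the same order. Note that cvModel.subModels is only populated when setCollectSubModels(true) was called on the CrossValidator before fitting (available from Spark 2.3 onwards). The toy data, the plain LogisticRegression standing in for the question's pipeline, the grid values, and the CvResults object name are all assumptions made for illustration.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the original question.
object CvResults {

  // Fits a small cross-validated grid and returns (parameter settings, average metric) pairs.
  def run(spark: SparkSession): Array[(String, Double)] = {
    import spark.implicits._

    // Toy two-class data set (assumption), standing in for trainingData.
    val trainingData = (0 until 60).map { i =>
      val label = (i % 2).toDouble
      (label, Vectors.dense(label + (i % 5) * 0.1, (i % 7) * 0.1))
    }.toDF("label", "features")

    // A plain estimator stands in for the question's pipeline (assumption).
    val lr = new LogisticRegression().setMaxIter(10)
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .addGrid(lr.elasticNetParam, Array(0.0, 0.5))
      .build()

    val cv = new CrossValidator()
      .setEstimator(lr)
      .setEvaluator(new MulticlassClassificationEvaluator)
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(3)
      .setCollectSubModels(true) // required if cvModel.subModels will be read later

    val cvModel = cv.fit(trainingData)

    // avgMetrics(i) is the fold-averaged metric for getEstimatorParamMaps(i).
    cvModel.getEstimatorParamMaps.zip(cvModel.avgMetrics).map { case (params, metric) =>
      val settings = params.toSeq.map(p => s"${p.param.name}=${p.value}").mkString(", ")
      (settings, metric)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("cv-results").getOrCreate()
    try {
      run(spark).foreach { case (settings, metric) => println(f"$metric%.3f for $settings") }
    } finally {
      spark.stop()
    }
  }
}
```

If the fitted models themselves are needed rather than just the scores, cvModel.subModels(foldIndex)(paramIndex) gives the model trained on that fold with that parameter map, provided collectSubModels was enabled. Unlike scikit-learn's output, avgMetrics only reports the mean over folds, so a spread such as the (+/-...) figures above would have to be computed separately.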

Matko Soric
