I am trying to find an optimal parameter set for an XGB_Classifier using GridSearchCV. Since my data is highly imbalanced, both fitting and scoring (during cross-validation) must use sample weights, so I need a custom scorer that takes a 'weights' vector as a parameter. However, I can't find a way to have GridSearchCV pass the 'weights' vector to the scorer.
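
For illustration, a scorer of the kind described might look like the sketch below. It assumes weighted accuracy as the metric; the name `weighted_accuracy` and the sample values are hypothetical and not taken from the original post. The point is the third argument, which is what the search would have to supply for each validation fold.

```python
import numpy as np

def weighted_accuracy(y_true, y_pred, weights):
    """Fraction of correctly classified samples, each counted by its weight.

    Hypothetical helper for illustration; not part of scikit-learn or XGBoost.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    weights = np.asarray(weights, dtype=float)
    correct = (y_true == y_pred).astype(float)
    return float(np.sum(weights * correct) / np.sum(weights))

# Made-up example: one rare positive sample, up-weighted to 3.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]      # a classifier that always predicts the majority class
weights = [1.0, 1.0, 1.0, 3.0]

score = weighted_accuracy(y_true, y_pred, weights)
print(score)
```

Without weights the same predictions would score 3 of 4 correct, so the missed minority sample is penalised much more heavily once weights are applied.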

There were some attempts to add this functionality to gridsearch:

https://github.com/ndawe/scikit-learn/commit/3da7fb708e67dd27d7ef26b40d29447b7dc565d7

But they were not merged into master, and I am afraid this code is no longer compatible with upstream changes.

Has anyone faced a similar problem and is there any 'easy' way to cope with it?

petrovich

1 Answer

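You could manually balance your training dataset, as in the answer to "Scikit-learn balanced subsampling". With balanced classes, the unweighted scoring that GridSearchCV already does no longer favours the majority class.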
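
A minimal sketch of such balancing, assuming plain random undersampling of every class down to the size of the smallest one. The function name `balanced_subsample` and the toy data are hypothetical, and this is not necessarily the exact method from the linked answer.

```python
import numpy as np

def balanced_subsample(X, y, seed=0):
    """Randomly undersample each class to the size of the rarest class.

    Hypothetical helper for illustration; returns shuffled X and y subsets.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()

    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)                  # row indices of class c
        keep.append(rng.choice(idx, size=n_min, replace=False))
    keep = np.concatenate(keep)
    rng.shuffle(keep)                                 # avoid class-ordered rows
    return X[keep], y[keep]

# Made-up data: 8 samples of class 0, 2 samples of class 1.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)

X_bal, y_bal = balanced_subsample(X, y)
print(X_bal.shape, np.bincount(y_bal))
```

The balanced arrays can then be passed to GridSearchCV in the usual way. Note that undersampling discards majority-class rows, so it trades data for simplicity.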

maxymoo