
I am trying to use multiple metrics in GridSearchCV. My project needs several, including accuracy and F1 score. However, after following the sklearn models and online posts, I can't get mine to work. Here is my code:

from sklearn.model_selection import GridSearchCV
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier  # needed for the classifier below
clf = KNeighborsClassifier()

param_grid = {'n_neighbors': range(1,30), 'algorithm': ['auto','ball_tree','kd_tree', 'brute'], 'weights': ['uniform', 'distance'],'p': range(1,5)}

# Metrics for evaluation:
met_grid = ['accuracy', 'f1']  # The metric codes from sklearn

custom_knn = GridSearchCV(clf, param_grid, scoring=met_grid, refit='accuracy', return_train_score=True)

custom_knn.fit(X_train, y_train)
y_pred = custom_knn.predict(X_test)

The error occurs at custom_knn.fit(X_train, y_train). Furthermore, if you comment out scoring=met_grid, refit='accuracy', return_train_score=True, it works. Here is the error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

Also, if you could explain multiple metric evaluation or refer me to someone who can, that would be much appreciated!
Thanks

Tanner Clark

1 Answer


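`f1` is a binary classification metric: it scores with average='binary', which is what the error message complains about. For multiclass classification you have to use an averaged F1 such as `f1_macro`, `f1_micro`, or `f1_weighted`, which differ in how the per-class scores are aggregated. You can find the exhaustive list of scoring options available in sklearn here.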
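As a rough illustration of the difference, the sketch below calls `f1_score` directly on a made-up three-class example (the labels are invented for this sketch, not taken from the question). It shows the default binary average being rejected and the averaged variants working:

```python
from sklearn.metrics import f1_score

# Toy three-class labels, made up for illustration only.
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# The default average='binary' is rejected for multiclass targets.
try:
    f1_score(y_true, y_pred)
    binary_error = None
except ValueError as exc:
    binary_error = str(exc)

# Averaged variants aggregate the per-class F1 scores in different ways.
f1_macro = f1_score(y_true, y_pred, average="macro")        # unweighted mean over classes
f1_micro = f1_score(y_true, y_pred, average="micro")        # computed from global counts
f1_weighted = f1_score(y_true, y_pred, average="weighted")  # mean weighted by class support

print(binary_error)
print(f1_macro, f1_micro, f1_weighted)
```

The scorer strings `f1_macro`, `f1_micro`, and `f1_weighted` accepted by `scoring=` correspond to these `average` settings.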

Try this!

scoring = ['accuracy','f1_macro']

custom_knn = GridSearchCV(clf, param_grid, scoring=scoring,
                          refit='accuracy', return_train_score=True, cv=3)
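
For a self-contained version of the call above, here is a minimal sketch. It assumes the iris dataset as a stand-in for your data and uses a much smaller parameter grid so it runs quickly; both are choices made for this sketch, not part of the question. It also shows where each metric ends up in `cv_results_`, and that `refit='accuracy'` is the metric that selects the best parameters:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris stands in for the asker's data (an assumption; it has three classes).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A deliberately small grid so the sketch runs quickly.
param_grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    scoring=["accuracy", "f1_macro"],
    refit="accuracy",  # only this metric selects best_params_
    return_train_score=True,
    cv=3,
)
search.fit(X_train, y_train)

# Every scorer gets its own columns in cv_results_.
results = search.cv_results_
best = search.best_index_
best_accuracy = results["mean_test_accuracy"][best]
best_f1_macro = results["mean_test_f1_macro"][best]

print(search.best_params_)
print(best_accuracy, best_f1_macro)

# predict() uses the estimator refit with the best-accuracy parameters.
y_pred = search.predict(X_test)
```

Swapping `refit="accuracy"` for `refit="f1_macro"` would select parameters by macro F1 instead, while accuracy would still be recorded in `cv_results_`.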
Venkatachalam
  • Thank you so much! That worked. Now, trying to understand: is my knn classifier now optimizing the hyperparameters with accuracy AND f1 score as the overall metrics? – Tanner Clark Dec 30 '18 at 18:14
  • @TannerClark No, it will only optimize the scorer given in `refit`. So in the above code, `accuracy` is being optimized. `f1` is just monitored. – Vivek Kumar Jan 02 '19 at 10:49
  • If you get a type error like I did, it is because `refit` is True by default. With multiple scorers you either have to set it to False or name a single scorer (either a value or a scoring function), and you must use the keyword. – petsol Aug 16 '23 at 12:18