Is there a way to specify a custom cost function in sklearn/Python? My real problem has 7 classes, but to keep it simple, assume a problem with 3 classes where I want to assign different costs to different misclassifications. I am mainly interested in the model properly distinguishing between class 1 and class 3.
- if the observation has class 1 and the model predicts class 1, the penalty is 0 (correct classification)
- if the observation has class 1 and the model predicts class 2, the penalty is 1
- if the observation has class 1 and the model predicts class 3, the penalty is 2
- if the observation has class 2 and the model predicts class 2, the penalty is 0 (correct classification)
- if the observation has class 2 and the model predicts class 3, the penalty is 1
- if the observation has class 2 and the model predicts class 1, the penalty is 1
- if the observation has class 3 and the model predicts class 3, the penalty is 0 (correct classification)
- if the observation has class 3 and the model predicts class 2, the penalty is 1
- if the observation has class 3 and the model predicts class 1, the penalty is 2
So the penalty matrix would look as follows:
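| True \ Predicted | Class 1 | Class 2 | Class 3 |
|------------------|---------|---------|---------|
| Class 1          | 0       | 1       | 2       |
| Class 2          | 1       | 0       | 1       |
| Class 3          | 2       | 1       | 0       |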
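To make the intent concrete, here is a minimal sketch of how I would score a set of predictions against this matrix. It uses only NumPy; the names `COST` and `mean_misclassification_cost` are my own and not part of any library, and I am assuming rows index the true class and columns the predicted class.

```python
import numpy as np

# Penalty matrix from above: rows are the true class, columns the predicted class.
COST = np.array([
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
])

def mean_misclassification_cost(y_true, y_pred, cost=COST, labels=(1, 2, 3)):
    """Average penalty per observation under the given cost matrix (hypothetical helper)."""
    # Map class labels (1, 2, 3) to row/column positions (0, 1, 2).
    index = {label: i for i, label in enumerate(labels)}
    rows = np.array([index[y] for y in y_true])
    cols = np.array([index[y] for y in y_pred])
    # Look up one penalty per (true, predicted) pair and average them.
    return cost[rows, cols].mean()

y_true = [1, 1, 2, 3, 3]
y_pred = [1, 3, 2, 2, 1]
example_cost = mean_misclassification_cost(y_true, y_pred)
print(example_cost)  # penalties are 0, 2, 0, 1, 2 -> mean 1.0
```

As far as I understand, a function like this could be wrapped with `sklearn.metrics.make_scorer(..., greater_is_better=False)` for model selection, but that only changes how models are evaluated, not the loss they are trained on, which is what I am really after.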
I assume that the 'class_weight' parameter in sklearn does something similar, but it accepts a dictionary rather than a matrix. Passing class_weight = {1:2, 2:1, 3:2} would just increase the weight for misclassifying class 1 and class 3. However, I want my model to get a larger penalty specifically when it chooses class 1 and the true class is class 3, and vice versa.
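A small sketch of why I think a dictionary is not enough: with one weight per class, such as `{1: 2, 2: 1, 3: 2}`, each observation's weight depends only on its true class. I am using `sklearn.utils.class_weight.compute_sample_weight` here to show the per-sample weights such a dictionary implies; the toy labels are made up for illustration.

```python
from sklearn.utils.class_weight import compute_sample_weight

# One weight per class: there is no way to say "3 predicted as 1" costs more
# than "3 predicted as 2".
class_weight = {1: 2, 2: 1, 3: 2}

# Made-up true labels for illustration.
y_true = [1, 2, 3, 3]

# Each sample gets the weight of its true class, whatever the model predicts.
weights = compute_sample_weight(class_weight, y_true)
print(weights)
```

Both class 3 observations get the same weight of 2, so confusing class 3 with class 1 is penalized exactly as much as confusing it with class 2.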
Is it possible to do something like this in sklearn? Maybe some other libraries/learning algorithms allow for unequal misclassification costs?