
I wonder if there is a way to specify a custom cost function in sklearn/Python. My real problem has 7 different classes, but to make it clearer, let's assume I want to specify different misclassification costs for a problem with 3 classes, and that I am mainly interested in the model properly distinguishing between class 1 and class 3.

  • if an observation has class 1 and the model predicts class 1, the penalty is 0 (correct classification)
  • if an observation has class 1 and the model predicts class 2, the penalty is 1
  • if an observation has class 1 and the model predicts class 3, the penalty is 2

  • if an observation has class 2 and the model predicts class 2, the penalty is 0 (correct classification)
  • if an observation has class 2 and the model predicts class 3, the penalty is 1
  • if an observation has class 2 and the model predicts class 1, the penalty is 1

  • if an observation has class 3 and the model predicts class 3, the penalty is 0 (correct classification)
  • if an observation has class 3 and the model predicts class 2, the penalty is 1
  • if an observation has class 3 and the model predicts class 1, the penalty is 2

So the penalty matrix (rows are the true class, columns the predicted class) would look as follows:

         Class 1  Class 2  Class 3
Class 1    0        1        2
Class 2    1        0        1
Class 3    2        1        0
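With a matrix like this, the total penalty of a set of predictions is just a lookup-and-sum. A minimal NumPy sketch (the function name and the toy label arrays are mine, for illustration):

```python
import numpy as np

# Penalty matrix: rows = true class, columns = predicted class (classes 1..3)
penalty = np.array([[0, 1, 2],
                    [1, 0, 1],
                    [2, 1, 0]])

def total_cost(y_true, y_pred):
    """Sum the per-sample penalties by indexing into the matrix.
    Labels are assumed to be 1-based, hence the -1 shift."""
    y_true = np.asarray(y_true) - 1
    y_pred = np.asarray(y_pred) - 1
    return penalty[y_true, y_pred].sum()

print(total_cost([1, 2, 3], [1, 2, 3]))  # all correct -> 0
print(total_cost([1, 3, 2], [3, 1, 2]))  # 2 + 2 + 0 -> 4
```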

I assume that the 'class_weight' parameter in sklearn does something similar, but it accepts a dictionary rather than a matrix. Passing class_weight = {1: 2, 2: 1, 3: 2} would just increase the weight for misclassifying classes 1 and 3; however, I want the model to incur a larger penalty specifically when it predicts class 1 and the true class is class 3, and vice versa.

Is it possible to do something like this in sklearn? Or do some other libraries/learning algorithms allow for unequal misclassification costs?

kroonike

1 Answer


First, in sklearn there is no way to train a model with a custom loss. However, you can implement your own evaluation function and tune your model's hyperparameters to optimize that metric.
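For example, the cost matrix from the question can be turned into an sklearn scorer with `make_scorer` and then passed to `GridSearchCV` or `cross_val_score` via the `scoring` argument. A hedged sketch (the matrix is the question's; the function names are mine):

```python
import numpy as np
from sklearn.metrics import make_scorer

# Penalty matrix from the question: rows = true class, columns = predicted
penalty = np.array([[0, 1, 2],
                    [1, 0, 1],
                    [2, 1, 0]])

def misclassification_cost(y_true, y_pred):
    # Mean penalty per sample; labels are assumed to be 1-based
    return penalty[np.asarray(y_true) - 1, np.asarray(y_pred) - 1].mean()

# greater_is_better=False: lower cost is better, so sklearn negates the score
cost_scorer = make_scorer(misclassification_cost, greater_is_better=False)
```

You would then use it like `GridSearchCV(estimator, param_grid, scoring=cost_scorer)` to pick the hyperparameters that minimize the expected cost.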

Second, you can optimize any custom loss with neural networks, for example using Keras, but for this purpose the function should be smooth. The first thing that comes to mind is a weighted cross-entropy. In this discussion, people experiment with implementations of such a function.
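One smooth surrogate along these lines is the expected misclassification cost: weight the predicted class probabilities by the penalty row of the true class. This is my own formulation, not necessarily the one from the linked discussion; a plain-NumPy sketch:

```python
import numpy as np

# Penalty matrix from the question (rows = true class, cols = predicted)
penalty = np.array([[0., 1., 2.],
                    [1., 0., 1.],
                    [2., 1., 0.]])

def expected_cost_loss(y_true, probs):
    """Mean expected misclassification cost: each sample's predicted
    probabilities are weighted by the penalty row of its true class.
    Smooth in `probs`, so usable as a neural-network loss.
    y_true: int labels 1..3; probs: (n_samples, 3) softmax outputs."""
    rows = penalty[np.asarray(y_true) - 1]   # (n, 3) penalty rows
    return (rows * probs).sum(axis=1).mean()

probs = np.array([[1.0, 0.0, 0.0],   # confident class 1
                  [0.0, 0.0, 1.0]])  # confident class 3
print(expected_cost_loss([1, 3], probs))  # both correct -> 0.0
```

Since the diagonal of the penalty matrix is zero, the loss vanishes exactly when all probability mass sits on the true class, and confusing class 1 with class 3 is charged twice as much as adjacent confusions.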

Third, the structure of your problem suggests that the order of the class labels is what really matters. If this is the case, you could try ordered logistic regression (see an example of its implementation).

Moreover, in your problem the cost is exactly sum(abs(predicted - actual)). So if you don't need probabilistic predictions, you can simply use a regressor that optimizes MAE (e.g. SGDRegressor with the 'epsilon_insensitive' loss, or DecisionTreeRegressor with the MAE criterion). After fitting the regression, you only need to find the thresholds that minimize your cost function.

David Dale