
I have implemented a gradient boosting decision tree to do multiclass classification. My custom loss functions look like this:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

def softmax(mat):
    # row-wise softmax over the raw scores
    res = np.exp(mat)
    res = np.multiply(res, 1 / np.sum(res, axis=1, keepdims=True))
    return res

def custom_asymmetric_objective(y_true, y_pred_encoded):
    # raw scores arrive flattened column-major, one block per class -> reshape to (n_samples, 3)
    pred = y_pred_encoded.reshape((-1, 3), order='F')
    pred = softmax(pred)
    # one-hot encode the true labels so they line up with the (n_samples, 3) predictions
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1))
    grad = (pred - y_true).astype("float")
    hess = 2.0 * pred * (1.0 - pred)
    return grad.flatten('F'), hess.flatten('F')


def custom_asymmetric_valid(y_true, y_pred_encoded):
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1)).flatten('F')
    margin = (y_true - y_pred_encoded).astype("float")
    loss = margin * 10
    return "custom_asymmetric_eval", np.mean(loss), False

Everything works, but now I want to adjust my loss function in the following way: it should "penalize" an item that is classified incorrectly, and on top of that a penalty should be added for a certain constraint (this is calculated beforehand; let's just say the penalty is e.g. 0.05, so just a real number). Is there any way to consider both the misclassification and the penalty value?
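For reference, here is roughly how I plug these two functions in during training. This is a simplified sketch assuming LightGBM's scikit-learn interface (which matches the (y_true, y_pred) signatures above); X_train, y_train, X_valid and y_valid are placeholders, and depending on the version num_class may need to be passed explicitly:

import lightgbm as lgb

# sketch only: wire the custom objective and custom eval metric into training
model = lgb.LGBMClassifier(objective=custom_asymmetric_objective, num_class=3)
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=custom_asymmetric_valid,
)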

  • `Deviance` loss, which is used in `GradientBoostingClassifier`, would already penalize the misclassification. What is the special constraint you want to add? Can you add details about it? – Venkatachalam Mar 09 '19 at 12:01
  • Is it possible to adjust the deviance loss so that the penalty is also added? To understand the constraint, the whole model is needed... but I think it is enough to know that the penalty for the constraint (which is calculated beforehand) is somewhere between 0 and 5%. – Sarah Mar 11 '19 at 07:38

1 Answer


Try L2 regularization: the weights are updated by subtracting the learning rate times (the error times x) plus the penalty term lambda times the weight squared:

[Image: L2 Regularization]

Simplifying:

[Image: L2 Simple]

This will be the effect:

[Image: Overfit]

ADDED: The penalization term (on the right of the equation) increases the generalization power of your model. If you overfit your model on the training set, performance will be poor on the test set. So you penalize those "right" classifications in the training set that generate error in the test set and compromise generalization.
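To make the update rule above concrete, here is a minimal numpy sketch of a single gradient-descent step with an L2 penalty. It is purely illustrative (a plain linear model, not the boosted trees from the question), and eta, lam, X, y and w are placeholder names:

import numpy as np

# toy data and placeholder hyperparameters
X = np.array([[0.5, 1.2], [1.0, -0.3], [0.2, 0.8]])
y = np.array([1.0, 0.0, 1.0])
w = np.zeros(2)
eta = 0.1   # learning rate
lam = 0.05  # L2 penalty strength (the lambda in the equations above)

# loss = sum((X @ w - y)**2) + lam * sum(w**2)
# the lam * w**2 term in the loss contributes 2 * lam * w to the gradient
error = X @ w - y
grad = 2 * X.T @ error + 2 * lam * w
w = w - eta * grad   # one penalized update step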

razimbres
  • Thanks for the answer. I just don't get where the misclassification (i.e. the difference between y_true and y_pred) is penalized. – Sarah Mar 09 '19 at 11:49
  • Please check the text I added to the answer under ADDED. – razimbres Mar 09 '19 at 12:01
  • You can do it from scratch, and it depends on the algorithm you are using. For neural networks, Keras has L1 and L2 regularization options. You also have Ridge Regression. For trees, like random forests, decision trees and gradient boosted trees, you can do something similar if you apply tree pruning, i.e. reduce overfitting by removing branches of the tree. Also, remember that the misclassifications in your training data may be due to systematic or random error, and some of them cannot be fixed. Example: `X=[0,1,2,1]` with `Y=[0]` and `X=[0,1,2,1]` with `Y=[1]` in the same dataset. – razimbres Mar 09 '19 at 12:42
  • @Sarah, check this link : https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html – razimbres Mar 09 '19 at 12:42
  • Is this approach appropriate for a classification task? I mean, linear least squares is actually used for regression, isn't it? – Sarah Mar 11 '19 at 07:40
  • It's the same principle: penalize the weights. – razimbres Mar 11 '19 at 11:41
  • If I use the Ridge classifier (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifier.html#sklearn.linear_model.RidgeClassifier): Where in the code is it possible to add my penalty term? – Sarah Mar 18 '19 at 10:28
  • When you calculate y_pred: you act on the weight penalty there, see the sketch below. – razimbres Mar 18 '19 at 10:43
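To illustrate the `RidgeClassifier` route from the last comments: the L2 penalty strength is set through the `alpha` parameter when the model is fitted, and the penalized weights are then what produce y_pred. A minimal sketch, with X_train, y_train and X_test as placeholders:

from sklearn.linear_model import RidgeClassifier

# alpha is the L2 penalty strength: larger alpha shrinks the weights more
clf = RidgeClassifier(alpha=0.05)
clf.fit(X_train, y_train)      # the L2-penalized weights are learned here
y_pred = clf.predict(X_test)   # predictions then use those penalized weights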