
I tried to build a multiclass classification model using LightGBM. After training the model, I scraped some data online and fed it into my model for prediction.

However, the result seems weird to me. I thought each nested array in the prediction holds the probability for each class (I have 4 classes). The result for x-test (the data I used for validation) looks right, but the result for the data I scraped looks weird: the probabilities don't add up to 1.

[screenshot: prediction result]

In this post, multiclass-classification-with-lightgbm, the prediction result didn't add up to 1 either!

The two dataframes look the same to me, and I am using exactly the same model! Can someone please tell me how to interpret the result, or what I have done wrong?

[screenshots: x-test dataframe, data I scraped dataframe]

Ziva Lee
  • Have you used the parameters correctly? Passing `multiclass` as the objective and using softmax (activation function) for probability calibration. – Yash Kumar Atri Dec 11 '19 at 12:07
  • @Yash Kumar Atri Here's how I set my params: `params = {'objective': 'multiclass', 'boosting': 'gbdt', 'num_class': 4, 'num_leaves': 5, 'max_depth': 5, 'learning_rate': 0.003, 'max_bin': 100, 'is_unbalance': 'true'}`. I know multiclass uses softmax to normalize the raw scores. Hope I haven't got it wrong. Thank you! – Ziva Lee Dec 11 '19 at 14:59

1 Answer


For multi-class classification, when the classes are not mutually exclusive, the probabilities may not sum to one. Say, for example, you are classifying dog, cat, and bird images, but your model is shown a car image: the probabilities for the three classes should all be low, and they need not sum to 1. You need to rescale the predictions using this formula if you want to force the probabilities to sum to 1.
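The rescaling described above amounts to dividing each row of scores by its row sum; a minimal sketch, assuming `scores` is an (n_samples, n_classes) array of non-negative per-class scores (the values here are made up):

```python
import numpy as np

# Hypothetical per-class scores that do not sum to 1 per row
scores = np.array([[0.2, 0.5, 0.1, 0.4],
                   [1.2, 0.3, 0.6, 0.9]])

# Divide each row by its sum so every row adds up to 1
probs = scores / scores.sum(axis=1, keepdims=True)
print(probs.sum(axis=1))  # each row now sums to 1
```

If the outputs are raw (possibly negative) scores rather than probabilities, a softmax per row is the usual alternative.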

On the other hand, when you have a single classifier choosing one class among the others, for example when an image can only be a cat, a dog, or a bird, the classes are mutually exclusive, and the probabilities should sum to 1.

More references: ref1, ref2, ref3

smerllo
  • So is it also possible for a probability to be larger than 1? All the probabilities I got in the 2nd scenario are larger than 1. – Ziva Lee Dec 11 '19 at 16:14
  • This should also apply when probabilities are more than 1. You need to rescale your outputs. – smerllo Dec 11 '19 at 16:42
  • May I ask how it is possible for a probability to be greater than 1? Thank you! – Ziva Lee Dec 11 '19 at 17:16
  • Yeah, it's very possible! In your context, what the model tries to do is evaluate each class in a binary manner (`1 vs 0` for each class), so nothing is absurd here. Referring to the exclusivity rule that we talked about can help you understand the concept. Does that help? – smerllo Dec 16 '19 at 17:22