
I have been wondering what the value of alpha (the weight of a weak classifier) should be when the classifier has an error rate of zero (perfect classification), since the formula for alpha is 0.5 * Math.log((1 - errorRate) / errorRate).
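For context, this is roughly what the formula gives when errorRate is exactly 0 (just an illustration; the variable names are mine):

    double errorRate = 0.0; // perfect classification on the training data
    double alpha = 0.5 * Math.log((1 - errorRate) / errorRate);
    System.out.println(alpha); // prints Infinity: (1 - 0) / 0 is Double.POSITIVE_INFINITY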

Thank you.

4 Answers


If you're boosting by reweighting and passing the whole training data to the weak learner, I'd say you've found a weak classifier that is in fact strong; after all, it flawlessly classified your data.

In this case, it should happen in the first Adaboost iteration. Add that weak classifier to your strong classifier with an alpha set to 1 and stop the training.
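A minimal sketch of that early stop (this isn't a full Adaboost implementation; the "weak learners" here are just hard-coded boolean arrays saying which examples each one classifies correctly, so the snippet stands on its own):

    import java.util.ArrayList;
    import java.util.List;

    public class EarlyStopSketch {
        public static void main(String[] args) {
            double[] weights = {0.25, 0.25, 0.25, 0.25};           // uniform initial example weights
            List<boolean[]> learners = new ArrayList<>();
            learners.add(new boolean[]{true, true, false, true});  // round 1: one mistake
            learners.add(new boolean[]{true, true, true, true});   // round 2: flawless

            List<Double> alphas = new ArrayList<>();
            for (boolean[] correct : learners) {
                double error = 0.0;
                for (int i = 0; i < weights.length; i++) {
                    if (!correct[i]) error += weights[i];          // weighted error of this round's classifier
                }
                if (error == 0.0) {
                    alphas.add(1.0);                               // flawless classifier: alpha = 1, stop training
                    break;
                }
                alphas.add(0.5 * Math.log((1 - error) / error));
                // (example reweighting omitted; it isn't needed to show the early stop)
            }
            System.out.println(alphas);                            // e.g. [0.5493..., 1.0]
        }
    }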

Now, if that happened while you're boosting by resampling, and your sample is only a subset of your training data, I believe you should discard this subset and retry with another sample.

I believe you got such a result because you're playing with a very simple example, or because your training dataset is very small or isn't representative. It's also possible that your weak classifier isn't really that weak and fits the training data too easily.

Ramiro

Nominally, the alpha for the weak classifier with zero error should be large because it classifies all training instances correctly. I'm assuming you're using all training data to estimate alpha. It's possible you're estimating alpha only with the training sample for that round of boosting as well--in which case your alpha should be slightly smaller based on the sample size--but same idea.

In theory, this alpha should be near infinity if your other alphas are unnormalized. In practice, the suggestion to check if your error is zero and give those alphas a very high value is reasonable, but error rates of zero or near zero typically indicate you're overfitting (or just have too little training data to estimate reliable alphas).

This is covered in section 4.2 of Schapire & Singer's Confidence Rated Predictions version of Adaboost. They suggest adding a small epsilon to your numerator and denominator for stability:

alpha = 0.5 * Math.log((1 - errorRate + epsilon) / (errorRate + epsilon))

In any event, this alpha shouldn't be set to a small value (it should be large). And setting it to 1 only makes sense if the alphas from all the other rounds of boosting are normalized so that, for example, their sum is at most 1.
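A sketch of the smoothed formula in code (EPSILON is just a small constant you pick yourself, e.g. something on the order of 1 over the number of training examples):

    // smoothed alpha as suggested above; EPSILON keeps the log finite when errorRate is 0 or 1
    static final double EPSILON = 1e-10;

    static double alpha(double errorRate) {
        return 0.5 * Math.log((1 - errorRate + EPSILON) / (errorRate + EPSILON));
    }
    // alpha(0.0) is about 11.5 here: large, but finite and usable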

JayKay

I ran into this problem a few times, and what I usually do is check whether the error is equal to 0 and, if it is, set it to 1/10 of the minimum weight. It's a hack, but it usually ends up working pretty well.
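Roughly, the check looks like this (assuming error is the sum of the weights of the misclassified examples, as explained in the comment below):

    // weights[i] is the Adaboost weight of training example i;
    // misclassified[i] is true if the current weak classifier got example i wrong
    static double weightedError(double[] weights, boolean[] misclassified) {
        double error = 0.0;
        double minWeight = Double.MAX_VALUE;
        for (int i = 0; i < weights.length; i++) {
            if (misclassified[i]) error += weights[i];
            if (weights[i] < minWeight) minWeight = weights[i];
        }
        if (error == 0.0) {
            error = minWeight / 10.0; // the hack: pretend we got "1/10 of the least important example" wrong
        }
        return error;
    }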

nikola
  • What do you mean to set it equal to 1/10 of min weight? – James Euangel E. Limpiado Feb 22 '13 at 18:08
  • As a part of Adaboost, you have weights for the examples you use in training. Those weights are typically initialized uniformly (1/N) and updated at each iteration of Adaboost. Your error is simply the sum of the weights of the examples your weak classifier gets wrong. If your error is zero, just set it to 1/10 of the minimum of those weights, thus saying that your classifier is so good (but not perfect) that it only got wrong 1/10 of the least important example. – nikola Feb 23 '13 at 13:12

It's actually better not to use such a classifier in your Adaboost prediction at all: it isn't really a weak classifier, so it won't improve the ensemble much and will tend to eat up all the weight.

sww