
I have a binary classification problem that I am solving with an SVM. The classes are unbalanced in the training data. I now need posterior probability outputs, not just a binary score. I tried Platt scaling via both Weka's SMO and LibSVM. For both implementations, the results are worse, in terms of F1-measure for the minority class, than when I generated only binary labels.

Do you know of a way to transform SVM binary outputs into probabilities that preserves the rule: "prob >= 0.5 if and only if decision value >= 0"?

In other words, the label each sample gets should be the same whether I use binary classification or probabilities.
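For reference, one mapping that satisfies this rule is a sigmoid with its intercept fixed at zero (a restricted, one-parameter form of Platt scaling). This is a sketch, not a calibrated model: the helper name `decision_to_prob` and the fixed `scale` parameter are assumptions for illustration; in practice the scale would be fit by maximum likelihood on held-out decision values.

```python
import numpy as np

def decision_to_prob(decision_values, scale=1.0):
    """Map SVM decision values to probabilities via a zero-intercept sigmoid.

    Because the intercept is fixed at 0 and scale > 0, the mapping is
    monotone and sign-preserving: prob >= 0.5 exactly when decision >= 0.
    (Full Platt scaling fits an intercept too, which can break this rule.)
    """
    d = np.asarray(decision_values, dtype=float)
    return 1.0 / (1.0 + np.exp(-scale * d))

# Example: negative decision values map below 0.5, positive ones above.
probs = decision_to_prob([-2.0, 0.0, 3.0])
```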

Yonanam

1 Answer


An SVM can be configured to output class-membership probabilities. Check your toolkit's documentation to learn how to enable this.

For example, in scikit-learn:

When the constructor option probability is set to True, class membership probability estimates (from the methods predict_proba and predict_log_proba) are enabled.
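A minimal sketch of that option in scikit-learn, assuming a synthetic imbalanced dataset built with `make_classification` (the class weights and random seed here are arbitrary choices for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic imbalanced data: ~90% majority class, ~10% minority class.
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)

# probability=True enables predict_proba / predict_log_proba,
# implemented internally via cross-validated Platt scaling.
clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)

probs = clf.predict_proba(X[:5])   # shape (5, 2); each row sums to 1
labels = clf.predict(X[:5])        # hard labels from the decision function
```

Note that `predict` uses the raw decision function, while `predict_proba` uses the Platt-scaled model, so thresholding the probabilities at 0.5 is not guaranteed to reproduce the hard labels.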

Atilla Ozgur
  • I used this setting both in Weka's SMO and in LibSVM. There, and also in scikit-learn, the implementation is Platt scaling. The problem with this solution is that it gives different results compared to the binary case. – Yonanam May 17 '15 at 14:41
  • @Yonanam: I know that behaviour for libSVM. But why is it a problem? Is the classifier with probability performing significantly worse? – stefan May 18 '15 at 19:32
  • @stefan: Yes, the performance in terms of F1 measure for the minority class is worse. I think this is because the data is unbalanced, and fitting a logistic function is problematic in this case. – Yonanam May 25 '15 at 13:56
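The disagreement discussed in these comments can be measured directly: compare the hard labels from the decision function against labels obtained by thresholding the Platt probabilities at 0.5. This is a diagnostic sketch under the same assumed synthetic-data setup; the count of disagreements may be zero or nonzero depending on the data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Assumed synthetic imbalanced dataset, for illustration only.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)
clf = SVC(probability=True, random_state=0).fit(X, y)

hard = clf.predict(X)                               # sign of decision function
soft = (clf.predict_proba(X)[:, 1] >= 0.5).astype(int)  # thresholded Platt output

# Nonzero here means Platt scaling flipped some labels relative to
# the raw decision values, i.e. the mapping is not sign-preserving.
n_disagree = int(np.sum(hard != soft))
```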