
I am trying to use Vowpal Wabbit to do binary classification, i.e. given feature values, VW should classify each example as either 1 or 0. This is how I have the training data formatted:

1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...
-1 'name2 | feature1:1 feature2:0 feature3:5 feature4:2565 ...
etc

I have about 30,000 data points labeled 1 and about 3,000 labeled 0. I also hold out 100 points of each label to test on after I create the model; in the prediction set every test point is given a default label of 1. Here is how I format the prediction set:

1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...

From my understanding of the VW documentation, I need to use either the logistic or hinge loss function for binary classification. This is how I've been creating the model (with one loss function or the other):

vw -d ../training_set.txt --loss_function logistic/hinge -f model

And this is how I try the predictions:

vw -d ../test_set.txt --loss_function logistic/hinge -i model -t -p /dev/stdout

However, this is where I run into problems. If I use the hinge loss function, all the predictions are -1. When I use the logistic loss function, I get arbitrary values between 5 and 11, with a general trend for data points that should be 0 to get lower values (5-7) and for data points that should be 1 to get higher values (6-11). What am I doing wrong? I've looked around the documentation and checked a bunch of articles about VW to see if I can identify my problem, but I can't figure it out. Ideally I would get a 0/1 value, or a value between 0 and 1 that corresponds to how confident VW is in the result. Any help would be appreciated!

  • Have you shuffled the training data? – Martin Popel Jul 25 '16 at 21:45
  • No, is that required? I didn't think that the order of the data had any importance, only the feature values. –  Jul 26 '16 at 16:28
  • If the training data contains first all negative examples followed by all positive examples then online learning (used in vw by default unless you specify `--bfgs`) will fail to train anything and will predict (almost) only positive labels. Random shuffling of the training data prevents this common pitfall. It is not strictly required if your training data are already shuffled (or if they follow some natural chronological order). – Martin Popel Jul 26 '16 at 18:22
  • #1 issue in the question description is that it breaks the principle "train as you test". You can't use `{1,-1}` labels for training and `{0,1}` labels for testing. #2 example order is critically important in online learning (unclear what your order is). #3 an input feature with a value of 0 is ignored. Also see similar Qs: https://stackoverflow.com/questions/24822288/correctness-of-logistic-regression-in-vowpal-wabbit/24832382#24832382 and https://stackoverflow.com/questions/24634602/how-to-perform-logistic-regression-using-vowpal-wabbit-on-very-imbalanced-datase/24641529#24641529 – arielf Aug 01 '16 at 02:32

2 Answers

  • If the output should be just -1 and +1 labels, use the --binary option (when testing).
  • If the output should be a real number between 0 and 1, use --loss_function=logistic --link=logistic. The logistic loss function is needed when training, so that the number can be interpreted as a probability (see the example commands after this list).
  • If the output should be a real number between -1 and 1, use --link=glf1.
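
For example, a minimal sketch of the probability-output variant, reusing the file and model names from the question (whether --link needs to be repeated at prediction time or is picked up from the saved model is an assumption worth verifying on your VW version):

vw -d ../training_set.txt --loss_function logistic --link logistic -f model
vw -d ../test_set.txt -i model -t --link logistic -p /dev/stdout

Replacing --link logistic with --binary on the prediction command would instead give hard -1/+1 labels.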

If your training data is unbalanced, e.g. 10 times more positive examples than negative, but your test data is balanced (and you want to get the best loss on this test data), set the importance weight of the positive examples to 0.1 (because there are 10 times more positive examples).
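
With the roughly 10:1 ratio in the question (about 30,000 positive vs. about 3,000 negative examples) that means weighting the positive examples by 0.1. A sketch of what the weighted training lines could look like, reusing the hypothetical feature names from the question (the importance weight goes between the label and the tag):

1 0.1 'name | feature1:0 feature2:1 feature3:48 feature4:4881
-1 'name2 | feature1:1 feature2:0 feature3:5 feature4:2565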

Martin Popel

Independently of your tool and/or specific algorithm, you can use learning curves and train/cross-validation/test splitting to diagnose your algorithm and determine what your problem is. After diagnosing it you can adjust your algorithm; for example, if you find you are over-fitting you can take actions such as:

  1. Add regularization (see the VW sketch after this list)
  2. Get more training data
  3. Reduce the complexity of your model
  4. Eliminate redundant features.
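
If over-fitting does turn out to be the problem, the regularization step can be applied directly in VW through its --l1 and --l2 options; a hedged sketch, with placeholder regularization strengths that you would tune on a validation split rather than values to copy:

vw -d ../training_set.txt --loss_function logistic --l1 1e-6 --l2 1e-6 -f model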

You can refer to Andrew Ng's "Advice for machine learning" videos on YouTube for more details on this subject.

Luis Leal
  • Thanks for the advice, did you see anything obviously wrong with the way I've set up my data or the way I'm training and running predictions with VW? –  Jul 25 '16 at 20:23