I am trying to use Vowpal Wabbit (VW) for binary classification: given a set of feature values, VW should classify the example as either 1 or 0. This is how my training data is formatted:
1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...
-1 'name2 | feature1:1 feature2:0 feature3:5 feature4:2565 ...
etc
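For reference, lines in this format can be produced with something like the following Python sketch. The helper name to_vw_line and the dictionary of values are only for illustration; the feature names and values are the ones from the first example line above.

```python
def to_vw_line(label, tag, features):
    """Build one VW input line: label, 'tag, a bar, then name:value pairs."""
    # Join the features as space-separated name:value pairs, in dict order.
    feats = " ".join(f"{name}:{value}" for name, value in features.items())
    return f"{label} '{tag} | {feats}"


# Hypothetical row, mirroring the first training line shown above.
row = {"feature1": 0, "feature2": 1, "feature3": 48, "feature4": 4881}
print(to_vw_line(1, "name", row))
```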
I have about 30,000 data points of class 1 and about 3,000 data points of class 0. I am holding out 100 examples of each class to test on after I create the model. In the test file, every example is labeled 1 by default. Here is how I format the prediction set:
1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...
From my understanding of the VW documentation, I need to use either the logistic or the hinge loss function for binary classification. I have tried both; logistic/hinge in the commands below stands for whichever one I was using. This is how I've been creating the model:
vw -d ../training_set.txt --loss_function logistic/hinge -f model
And this is how I run the predictions:
vw -d ../test_set.txt --loss_function logistic/hinge -i model -t -p /dev/stdout
However, this is where I'm running into problems. With the hinge loss function, all the predictions are -1. With the logistic loss function, I get seemingly arbitrary values between 5 and 11. There is a general trend: data points that should be 0 get lower values (5-7), and data points that should be 1 get values from 6 to 11. What am I doing wrong? I've looked through the documentation and a number of articles about VW, but I can't identify the problem. Ideally I would get either a 0/1 value or a value between 0 and 1 that reflects how confident VW is in the result. Any help would be appreciated!
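To make the desired output concrete, here is a small Python sketch of the mapping I am after. It assumes the numbers VW prints with the logistic loss are raw scores (log-odds) that a sigmoid would squash into the range 0 to 1; I have not confirmed that assumption. The function names and the 0.5 threshold are my own choices.

```python
import math


def sigmoid(score):
    """Map a raw score (assumed to be log-odds) to a value in [0, 1]."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def to_class(score, threshold=0.5):
    """Turn a raw score into a 0/1 label; the threshold is an assumption."""
    return 1 if sigmoid(score) >= threshold else 0


# Illustrative raw scores only, not real VW output.
for raw in (-2.0, 0.0, 5.0, 11.0):
    print(raw, round(sigmoid(raw), 4), to_class(raw))
```

Under this assumption, raw scores between 5 and 11 would all map to values above 0.99, so the mapping on its own would not separate my two classes.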