I'm trying to use SVMLight to build a classifier that detects whether a noun phrase (NP) is anaphoric or not. I have my features, but I'm stuck on understanding the format of the input file. Should I convert all of my text to this format, or should I include only the NPs that represent positive and negative instances? And is there any software that can convert my file to this format?
<line> .=. <target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
<target> .=. +1 | -1 | 0 | <float> // for a positive instance, should I put +1?
<feature> .=. <integer> | "qid" // should I write one <feature>:<value> pair for each of my features?
<value> .=. <float>
<info> .=. <string> // should this contain the NP?
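To make the question concrete, here is a minimal Python sketch of how I imagine one line in this format could be produced from an NP's features. The helper name `to_svmlight_line`, the feature indices, and the example values are all hypothetical. It assumes that labels are +1 or -1, that feature indices are positive integers written in increasing order, and that zero-valued features can be left out.

```python
def to_svmlight_line(label, features, info=None):
    """Format one instance as a single SVMLight-style input line.

    label    -- +1 (positive instance) or -1 (negative instance)
    features -- dict mapping feature index (int >= 1) to a numeric value
    info     -- optional free text placed after '#', e.g. the NP itself
    """
    if label not in (1, -1):
        raise ValueError("label must be +1 or -1 for classification")

    parts = ["+1" if label > 0 else "-1"]

    # Write the pairs in increasing order of feature index.
    for index in sorted(features):
        if index < 1:
            raise ValueError("feature indices must be positive integers")
        value = features[index]
        if value == 0:
            continue  # assumption: zero-valued features may be omitted
        parts.append(f"{index}:{value:g}")

    line = " ".join(parts)
    if info:
        line += f" # {info}"
    return line


# Hypothetical anaphoric NP with two non-zero features.
print(to_svmlight_line(1, {3: 0.5, 1: 1.0}, "the cat"))
# prints: +1 1:1 3:0.5 # the cat
```

With something like this, each candidate NP would become one line of the input file, rather than the whole text.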
Also, what exactly should the model file contain?
Your help would be very much appreciated.