I am using the svmlight package in Python to train an SVM ranking model. However, I cannot figure out how to pass the training data to the learn function. My Python source code is as follows:
import svmlight
trainingDat = open('train.dat','r')
model = svmlight.learn(trainingDat, type='ranking')
The data file (train.dat) looks like this:
# query 1
3 qid:1 1:1 2:1 3:0 4:0.2 5:0
2 qid:1 1:0 2:0 3:1 4:0.1 5:1
1 qid:1 1:0 2:1 3:0 4:0.4 5:0
1 qid:1 1:0 2:0 3:1 4:0.3 5:0
# query 2
1 qid:2 1:0 2:0 3:1 4:0.2 5:0
2 qid:2 1:1 2:0 3:1 4:0.4 5:0
1 qid:2 1:0 2:0 3:1 4:0.1 5:0
1 qid:2 1:0 2:0 3:1 4:0.2 5:0
# query 3
2 qid:3 1:0 2:0 3:1 4:0.1 5:1
3 qid:3 1:1 2:1 3:0 4:0.3 5:0
4 qid:3 1:1 2:0 3:0 4:0.4 5:1
1 qid:3 1:0 2:1 3:1 4:0.5 5:0
I get the following error on running the code:
TypeError: document should be a tuple
I looked for similar questions and found one: Load svmlight format error
The answer to that question suggests implementing a parser that reads the data file shown above and converts each line to a tuple of target and features. However, when training a ranker, we also need to tell the learner which query (the qid) each instance belongs to, and that answer does not cover this.
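For concreteness, here is a minimal sketch of the kind of parser that answer describes, extended to keep the qid alongside each instance. The helper names (parse_ranking_line, parse_ranking_lines) are hypothetical, and the (label, features, qid) layout is an assumption about what the ranking mode expects, not something I have confirmed against the package.

```python
def parse_ranking_line(line):
    """Parse one SVM-light style line into (label, [(feature_id, value), ...], qid).

    Returns None for blank lines and comment-only lines such as '# query 1'.
    """
    # Drop anything after '#', then surrounding whitespace.
    line = line.split('#', 1)[0].strip()
    if not line:
        return None

    tokens = line.split()
    label = float(tokens[0])   # target rank value
    qid = None                 # stays None if the line has no qid token
    features = []
    for token in tokens[1:]:
        key, value = token.split(':', 1)
        if key == 'qid':
            qid = int(value)
        else:
            features.append((int(key), float(value)))
    return (label, features, qid)


def parse_ranking_lines(lines):
    """Parse an iterable of lines, skipping blanks and comments."""
    docs = []
    for line in lines:
        doc = parse_ranking_line(line)
        if doc is not None:
            docs.append(doc)
    return docs


# First two instances of query 1 from train.dat, inlined so the sketch runs alone.
sample = [
    "# query 1",
    "3 qid:1 1:1 2:1 3:0 4:0.2 5:0",
    "2 qid:1 1:0 2:0 3:1 4:0.1 5:1",
]
training_data = parse_ranking_lines(sample)
```

With the real file, the same function could be fed open('train.dat') instead of the inline sample. Whether the resulting list can then be passed as svmlight.learn(training_data, type='ranking') is exactly the part I am unsure about.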
My question: How do I pass training data to svmlight.learn when using the ranking configuration, so that the query id of each instance is included?
Thank you in advance!!