I am trying to train a LambdaMART model to perform a pairwise ranking of a list of objects. My training dataset consists of 50,000 112-dimensional feature vectors, where each feature is coded as a non-negative integer.
The target value is a positive integer (not consecutive). Given two new instances, X and Y, I want my model to predict whether the target value for X is greater than that for Y.
Since this is not an information retrieval application, the concept of a query is irrelevant. All 50,000 instances belong to the same "query".
When I run my model, even with `train.fraction` set for a 70%/30% train/validation split, I get 0 deviance on my validation set, and `gbm.perf` throws an exception if I try to use the OOB method to find the optimal number of trees.
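For reference, here is roughly the call I'm making (a minimal sketch with synthetic data; `df`, `target`, and the feature names are placeholders, and I'm assuming the single "query" can be expressed as a constant group column):

```r
library(gbm)

# Placeholder data: 50,000 rows of 112 non-negative integer features
# plus an integer target. Replace with the real dataset.
set.seed(1)
n <- 50000
df <- as.data.frame(matrix(sample(0:100, n * 112, replace = TRUE),
                           nrow = n))
names(df) <- paste0("f", 1:112)
df$target <- sample(1:1000, n, replace = TRUE)
df$query  <- 1L   # every instance belongs to the same "query"

fit <- gbm(
  target ~ . - query,            # all features; exclude the group column
  data = df,
  distribution = list(
    name   = "pairwise",
    group  = "query",            # column identifying the (single) group
    metric = "ndcg"
  ),
  n.trees = 1000,
  shrinkage = 0.05,
  interaction.depth = 3,
  train.fraction = 0.7           # first 70% of rows used for training
)

# Validation-set error over tree counts; "OOB" is the call that fails for me
best.iter <- gbm.perf(fit, method = "test")
```

Note that `train.fraction` takes the *first* 70% of rows rather than a random sample, so I shuffle the data beforehand.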
Overall, I'm pretty confused about what this package is doing with all these unhelpfully named parameters. All I want to do is specify a train/validation split and then minimize the validation error over a range of tree counts. That shouldn't be too much to ask, but this package makes it so difficult to know which knobs to set that I'm about to implement it myself just so I have some transparency into what it's doing.
Sorry for the rant, but I could use some help getting this package to return meaningful validation results.