I'm trying to use machine learning to label sentences (each sentence gets a single label, and I assume the sentences are independent of each other). I thought a linear-chain CRF model would be suitable for this case, but I have some questions.
I tried using CRF++ (the other implementations I looked at seem to use analogous formats). It takes sentences as input, but the output label is assigned to each token. How can I use a single label for the whole sentence? (The hack I thought of is to assign a meaningful label only to the final period in the test data and treat it as the output label for the whole sentence.)
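To make the hack concrete, here is a rough Python sketch of how I imagine the data could be written out. It relies on the CRF++ layout of one token per line, whitespace-separated columns with the label in the last column, and a blank line between sentences. The function names, the filler label "O", the labels "POS"/"NEG", and the idea of using the same layout for the training rows are my own assumptions, not something CRF++ prescribes.

```python
def sentence_to_rows(tokens, sentence_label, filler_label="O"):
    """Return one 'token<TAB>label' row per token.

    Only the last token (e.g. the final period) carries the sentence
    label; every other token gets the hypothetical filler label.
    """
    if not tokens:
        return []
    labels = [filler_label] * (len(tokens) - 1) + [sentence_label]
    return [f"{tok}\t{lab}" for tok, lab in zip(tokens, labels)]


def format_training_data(labelled_sentences):
    """Join all sentences into one string, separated by blank lines."""
    blocks = [
        "\n".join(sentence_to_rows(tokens, label))
        for tokens, label in labelled_sentences
    ]
    return "\n\n".join(blocks) + "\n"


# Made-up example sentences and labels, for illustration only.
example = [
    (["I", "love", "this", "."], "POS"),
    (["Not", "good", "."], "NEG"),
]
print(format_training_data(example))
```

With this layout, the label predicted on the last token would be read off as the label for the whole sentence, and the predictions on the other tokens would be ignored.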
How can sentences of different lengths be used? The training configuration requires specifying which tokens are taken into consideration when analysing the current token. But a sentence can have a large or small number of tokens, and I want to use all the tokens in a sentence (no more and no fewer) so that all of the available information is used.
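As I understand it, the configuration describes a fixed window of offsets around the current token. The following Python sketch only illustrates that concern and is not CRF++ code: the function name, the window of offsets -2..+2, and the "<PAD>" marker for positions outside the sentence are all hypothetical.

```python
def window_features(tokens, position, offsets=(-2, -1, 0, 1, 2), pad="<PAD>"):
    """Collect the tokens at fixed offsets around `position`.

    Positions that fall outside the sentence are replaced by a
    hypothetical padding marker.
    """
    features = []
    for offset in offsets:
        i = position + offset
        features.append(tokens[i] if 0 <= i < len(tokens) else pad)
    return features


short_sentence = ["Not", "good", "."]
long_sentence = ["I", "really", "love", "this", "film", "."]

# Features seen when analysing the final period of each sentence.
print(window_features(short_sentence, 2))
print(window_features(long_sentence, 5))
```

For the short sentence the window happens to cover every token, but for the long one the final period only "sees" the two tokens before it, which is exactly the loss of information I would like to avoid.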
From this question it seems that what I'm trying to do (a single label for the whole sequence) is possible, but I don't know how to format the training data for that.