The short answer is that you supply attributes of the word coffee (like w[-1]=drank to indicate the previous word) and its label (NOUN), and CRFsuite generates the actual indicator functions that compose the CRF model (including a feature that indicates that the label of the previous word is VERB). It knows to do this because it uses a "1st-order Markov CRF with dyad features," as described on the manual page you linked to.
One distinction that's important to make (and that the documentation could be more precise about) is the difference between "attributes" and "features": attributes are the observations you supply for each token, whereas features are links in the model that represent either (attribute, label) or (label, label) pairs.
So in your example, w[-1]=drank is an attribute that you supply. The pair (w[-1]=drank, NOUN) is a state feature and the label pair VERB --> NOUN is a transition feature; both are generated by CRFsuite rather than written by you.
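To make the attribute/feature split concrete, here is a rough sketch in plain Python of how supplied attributes and labels could be expanded into state and transition features. This is not CRFsuite's code or API: the function generate_features, the toy sentence, and the PRON tag are all made up for illustration, and it assumes features are simply the pairs observed in the training data.

```python
from collections import Counter

# Hypothetical toy sequence: (attributes you supply, label) for each token.
sentence = [
    (["w[0]=I"], "PRON"),
    (["w[0]=drank", "w[-1]=I"], "VERB"),
    (["w[0]=coffee", "w[-1]=drank"], "NOUN"),
]


def generate_features(sequence):
    """Count the (attribute, label) and (label, label) pairs seen in one sequence."""
    state = Counter()       # state features: (attribute, label)
    transition = Counter()  # transition features: (previous label, label)
    prev_label = None
    for attributes, label in sequence:
        for attribute in attributes:
            state[(attribute, label)] += 1
        if prev_label is not None:
            transition[(prev_label, label)] += 1
        prev_label = label
    return state, transition


state_features, transition_features = generate_features(sentence)
print(("w[-1]=drank", "NOUN") in state_features)  # True
print(("VERB", "NOUN") in transition_features)    # True
```

You only ever write the attribute strings and the labels; the pairs in state_features and transition_features are what the trainer would then learn weights for.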
I recommend the tutorial, which discusses this in more detail.