TreeTagger can do POS tagging as well as chunking, i.e. grouping tokens into nominal and verbal chunks, as in this German example:
$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german
reading parameters ...
tagging ...
finished.
<NC>
Das PDS die
</NC>
<VC>
ist VAFIN sein
</VC>
<NC>
ein ART eine
Test NN Test
</NC>
. $. .
I'm trying to do the same thing with treetaggerwrapper in Python (since it's faster than calling TreeTagger directly), but I can't work out how. The documentation refers to chunking as preprocessing, so I tried this:
tags = tagger.tag_text(u"Dieser Satz ist ein Satz.", prepronly=True)
But the output is just a list of the words with no information added. I'm starting to think that what the wrapper calls chunking is something different from what the tagger itself calls chunking, but maybe I'm just missing something? Any help would be appreciated.
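For reference, a fallback that bypasses the wrapper would be to run the chunker script through subprocess and parse its output. This is only a rough sketch: parse_chunker_output and chunk_german are hypothetical helper names, the script path is the one from the shell example (relative to the TreeTagger install directory), and it assumes the progress messages ("reading parameters ...") go to stderr, so that stdout contains only tag lines and word/POS/lemma lines.

```python
import subprocess

# Chunker output copied from the shell example above (progress lines left out).
SAMPLE = """<NC>
Das PDS die
</NC>
<VC>
ist VAFIN sein
</VC>
<NC>
ein ART eine
Test NN Test
</NC>
. $. .
"""


def parse_chunker_output(output):
    """Group chunker output into (label, [(word, pos, lemma), ...]) pairs.

    Tokens outside any chunk get the label None. If chunks are nested, each
    token is attached to the innermost open chunk only.
    """
    chunks = []
    stack = []  # currently open chunks, innermost last
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("</") and line.endswith(">"):
            # Closing tag: the innermost open chunk is complete.
            if stack:
                chunks.append(stack.pop())
        elif line.startswith("<") and line.endswith(">"):
            # Opening tag such as <NC> or <VC>.
            stack.append((line[1:-1], []))
        else:
            fields = line.split()
            if len(fields) != 3:
                continue  # not a word/POS/lemma line
            token = tuple(fields)
            if stack:
                stack[-1][1].append(token)
            else:
                chunks.append((None, [token]))
    return chunks


def chunk_german(text, script="cmd/tagger-chunker-german"):
    """Run the chunker script on `text` and parse its stdout (assumed path)."""
    result = subprocess.run(
        [script],
        input=text,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return parse_chunker_output(result.stdout)
```

Calling parse_chunker_output(SAMPLE) gives one entry per chunk, plus a None-labelled entry for the final period. The drawback is that chunk_german starts a new process for every call, which loses the speed advantage that made the wrapper attractive in the first place.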