I'm using OpenNLP to extract noun phrases from a chunk of text. Unfortunately, the documentation for OpenNLP is extremely confusing.
At the moment, I have two arrays: one with the tokenized text and another with the POS tags for those tokens. I fed these two arrays into the chunker, but it just labels each token in the text as O, B-PP, B-NP, I-NP, etc.
What I'd like to do is have an array of strings that contains only the noun phrases from the text, rather than an array of strings that labels the tokenized text as different phrases. Is there already some sort of function in OpenNLP that can return the noun phrases in an array of strings (or even in a data structure other than an array)?
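To make the goal concrete, here is a minimal sketch of the grouping I have in mind, written by hand in plain Java. It assumes the chunker's tags follow the usual scheme where B-NP starts a noun phrase and I-NP continues it; the class and method names (`NounPhraseGrouper`, `extractNounPhrases`) are my own and not part of OpenNLP. I'd rather use a built-in equivalent if one exists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an OpenNLP class): groups chunk tags into noun phrases.
class NounPhraseGrouper {

    // tokens and chunkTags are parallel arrays: chunkTags[i] is the chunker's tag for tokens[i].
    static List<String> extractNounPhrases(String[] tokens, String[] chunkTags) {
        if (tokens.length != chunkTags.length) {
            throw new IllegalArgumentException("tokens and chunkTags must have the same length");
        }
        List<String> phrases = new ArrayList<>();
        StringBuilder current = null; // the noun phrase being built, or null if none is open

        for (int i = 0; i < tokens.length; i++) {
            String tag = chunkTags[i];
            if (tag.equals("B-NP")) {
                // A new noun phrase starts; close the previous one if it is still open.
                if (current != null) {
                    phrases.add(current.toString());
                }
                current = new StringBuilder(tokens[i]);
            } else if (tag.equals("I-NP") && current != null) {
                // Continue the open noun phrase.
                current.append(' ').append(tokens[i]);
            } else {
                // Any other tag (O, B-PP, B-VP, a stray I-NP, ...) ends the open phrase.
                if (current != null) {
                    phrases.add(current.toString());
                    current = null;
                }
            }
        }
        // Flush a noun phrase that runs to the end of the input.
        if (current != null) {
            phrases.add(current.toString());
        }
        return phrases;
    }
}
```

Calling `NounPhraseGrouper.extractNounPhrases(tokens, tags)` with my token array and the chunker's output array should give back just the noun phrases, joined with single spaces.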
This looked like a related post ("How to extract the noun phrases using Open nlp's chunking parser"), but I don't think we're doing the same thing, as they're using the parse tree to accomplish their goals.
Any help would be greatly appreciated. Thanks in advance!