4

I am trying to extract the noun phrases from sentences. I am using the OpenNLP library with the parser model "en-parser-chunking.bin".

Code example:

    ArrayList<opennlp.tools.parser.Parse> nounPhrases = new ArrayList<>();

    searchmethod("what is the nickname of the British flag?");
    for (int t = 0; t < 50; t++) {
        str = text.get(t);
        InputStream is = new FileInputStream("en-parser-chunking.bin");
        ParserModel model = new ParserModel(is);
        opennlp.tools.parser.Parser parser = ParserFactory.create(model);
        opennlp.tools.parser.Parse[] topParses = ParserTool.parseLine(str, parser, 1);
        for (opennlp.tools.parser.Parse p : topParses) {
            p.show();
            if (p.getType().equals("NP")) {
                nounPhrases.add(p);
            }
        }
    }

With this code I get the following result:

(TOP (S (NP (NP (DT The) (NN nickname)) (PP (IN for) (NP (DT the) (JJ British) (NN flag)))) (VP (VBZ is) (NP (NP (DT the) (NNP Union) (NNP Jack.)) (SBAR (IN Although) (S (NP (PRP it)) (VP (VBZ is) (ADVP (RB only) (RB correctly)) (VP (VBN known) (PP (IN as) (NP (DT this) (NN when) (NN flown))) (PP (IN on) (NP (DT a) (NN ship.)))))))))))  

How can I extract the noun phrases from that result?

Any help would be greatly appreciated.

2 Answers

1

You could extract the NPs from that parse, but there is also a model at http://opennlp.sourceforge.net/models-1.5/en-chunker.bin that does only chunking (i.e. noun phrase detection) without building a full parse tree. It may be easier to use, although it requires tokenization and POS tagging steps before it can run.
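If you stay with the full parse, note that the output in the question starts with `(TOP ...`: the node returned for each parse is the root, so a type check on that node alone never matches `NP`, and the NP nodes sit deeper in the tree. The sketch below illustrates the idea on the bracketed string itself rather than on OpenNLP's `Parse` objects, so it runs without the library or the model. The class `BracketedNounPhrases` and its method `extract` are hypothetical names introduced here, and the code assumes a well-formed bracketing with whitespace-separated words (Penn Treebank escapes such as `-LRB-` are not handled).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper (not part of OpenNLP): collects the text under every NP node
// of a bracketed parse string such as "(TOP (S (NP (DT The) (NN nickname)) ...))".
final class BracketedNounPhrases {

    /**
     * Returns the covered text of every NP node, in the order the closing
     * brackets are reached (inner phrases before the phrases containing them).
     */
    static List<String> extract(String parse) {
        List<String> phrases = new ArrayList<>();
        Deque<String> labels = new ArrayDeque<>();       // labels of the currently open nodes
        Deque<StringBuilder> texts = new ArrayDeque<>(); // words collected under each open node
        int i = 0;
        while (i < parse.length()) {
            char c = parse.charAt(i);
            if (c == '(') {
                // Opening bracket: read the node label that follows it.
                int end = i + 1;
                while (end < parse.length() && !isDelimiter(parse.charAt(end))) {
                    end++;
                }
                labels.push(parse.substring(i + 1, end));
                texts.push(new StringBuilder());
                i = end;
            } else if (c == ')') {
                if (labels.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')' at index " + i);
                }
                // Closing bracket: the node is complete, so record it if it is an NP
                // and hand its text up to the parent node.
                String label = labels.pop();
                String text = texts.pop().toString();
                if (label.equals("NP")) {
                    phrases.add(text);
                }
                if (!texts.isEmpty()) {
                    appendWord(texts.peek(), text);
                }
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else {
                // A word: attach it to the innermost open node.
                int end = i;
                while (end < parse.length() && !isDelimiter(parse.charAt(end))) {
                    end++;
                }
                if (texts.isEmpty()) {
                    throw new IllegalArgumentException("Word outside any bracket at index " + i);
                }
                appendWord(texts.peek(), parse.substring(i, end));
                i = end;
            }
        }
        if (!labels.isEmpty()) {
            throw new IllegalArgumentException("Missing ')' for node " + labels.peek());
        }
        return phrases;
    }

    private static boolean isDelimiter(char c) {
        return c == '(' || c == ')' || Character.isWhitespace(c);
    }

    private static void appendWord(StringBuilder target, String word) {
        if (word.isEmpty()) {
            return;
        }
        if (target.length() > 0) {
            target.append(' ');
        }
        target.append(word);
    }

    public static void main(String[] args) {
        // Fragment taken from the parse shown in the question.
        String parse = "(NP (NP (DT The) (NN nickname)) (PP (IN for) (NP (DT the) (JJ British) (NN flag))))";
        for (String phrase : extract(parse)) {
            System.out.println(phrase);
        }
    }
}
```

With the `Parse` objects themselves, the same idea would presumably be a recursive walk: visit each node, recurse into `getChildren()`, and collect the nodes whose `getType()` is `"NP"`, reading their text with `getCoveredText()`. Nested NPs are all reported here; keep only the outermost or innermost ones if that suits the task better.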

Daniel Naber
  • When I did this I got the following results: B-NP I-NP B-PP B-NP I-NP I-NP B-VP B-NP I-NP I-NP B-SBAR B-NP B-VP I-VP I-VP I-VP B-PP B-NP I-NP B-VP B-PP B-NP I-NP [0..2) NP [2..3) PP [3..6) NP [6..7) VP [7..10) NP [10..11) SBAR [11..12) NP [12..16) VP [16..17) PP [17..19) NP [19..20) VP [20..21) PP [21..23) NP, but I want to get the actual text of the sentence for these chunks. How do I get it? – user4462040 Jan 30 '15 at 20:43
  • Please see http://stackoverflow.com/questions/15059878/opennlp-chunker-and-postag-results for the meaning of the noun phrase tags. – Daniel Naber Jan 31 '15 at 17:11
  • How can I get the entity out of that chunked data? @DanielNaber – smoothsipai Jun 02 '16 at 09:33
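On getting text back out of the chunker output: each tag lines up with one token, `B-NP` opens a noun phrase and `I-NP` continues it, and the spans such as `[0..2) NP` are token index ranges (end exclusive) into the same token array. A minimal sketch of the grouping step, in plain Java so it runs without the library, is given here. The class `ChunkTagGrouper` and its method `phrasesOfType` are hypothetical names, and the token array is an assumption: it is reconstructed from the parse in the question by splitting on whitespace, which matches the 23 tags in the comment above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of OpenNLP): groups tokens into phrases
// using the B-/I- chunk tags that align with them one to one.
final class ChunkTagGrouper {

    /** Returns the text of every chunk of the given type, e.g. "NP". */
    static List<String> phrasesOfType(String[] tokens, String[] tags, String type) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> phrases = new ArrayList<>();
        StringBuilder current = null; // the chunk being built, or null if none is open
        for (int i = 0; i < tokens.length; i++) {
            if (tags[i].equals("B-" + type)) {
                // A new chunk starts; close the previous one first.
                if (current != null) {
                    phrases.add(current.toString());
                }
                current = new StringBuilder(tokens[i]);
            } else if (tags[i].equals("I-" + type) && current != null) {
                // The open chunk continues.
                current.append(' ').append(tokens[i]);
            } else {
                // Any other tag ends the open chunk; a stray I- tag is ignored.
                if (current != null) {
                    phrases.add(current.toString());
                }
                current = null;
            }
        }
        if (current != null) {
            phrases.add(current.toString());
        }
        return phrases;
    }

    public static void main(String[] args) {
        // Assumed tokenization: the sentence from the question, split on whitespace.
        String[] tokens = ("The nickname for the British flag is the Union Jack. "
                + "Although it is only correctly known as this when flown on a ship.").split(" ");
        // Tags copied from the comment above.
        String[] tags = ("B-NP I-NP B-PP B-NP I-NP I-NP B-VP B-NP I-NP I-NP B-SBAR B-NP "
                + "B-VP I-VP I-VP I-VP B-PP B-NP I-NP B-VP B-PP B-NP I-NP").split(" ");
        for (String phrase : phrasesOfType(tokens, tags, "NP")) {
            System.out.println(phrase);
        }
    }
}
```

The same result can be read off the spans instead: for `[3..6) NP`, join tokens 3, 4 and 5. Note that the phrase "this when" comes out as an NP only because of the tagging error in the input, not because of the grouping step.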
0

Hi, I agree with the answer above, but if you look closely at your output, there is a problem in the parse tree that will lead to wrong chunks being detected.

In the example above, the PP shown below is wrong, because "flown" can never be an NN. I believe that correct POS tagging is the key. Please let me know if you need to know how the POS tagging can be corrected. Thanks.

(PP
    (IN as)
    (NP
        (DT this) (NN when) (NN flown)
    )
)