How can I integrate the Stanford Parser into my Java program?

Question

I have to develop a Java project that uses the Stanford Parser to split text into sentences and then generates a graph showing the relations between the words in each sentence. For example, for the input "Ohio is located in America." the output would be:

[image: output graph]

The image shows the graph. The output need not be identical, but it has to show the relations between the words in graph form. The graph can be generated using JGraph or JUNG. First, though, I have to integrate the parser into my program. How can I do that?

Take a look at the Jython interface linked here: http://nlp.stanford.edu/software/lex-parser.shtml — Daniel, Oct 17 '13 at 14:27

score 13 · Accepted Answer · edited Feb 17 '15 at 10:30

Download the Stanford Parser zip:
Add the jars to the build path of your project (include the model files).

Use the following snippet to parse sentences and return the constituency trees (dependency trees can be built by inspecting the structure of the tree):

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.trees.Tree;

class Parser {

    // Classpath location of the English PCFG model (it must be on the build path).
    private static final String PCFG_MODEL = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";

    private final TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "invertible=true");

    // The model is loaded once per Parser instance and reused for every sentence.
    private final LexicalizedParser parser = LexicalizedParser.loadModel(PCFG_MODEL);

    // Tokenize the sentence and return its constituency tree.
    public Tree parse(String str) {
        List<CoreLabel> tokens = tokenize(str);
        return parser.apply(tokens);
    }

    private List<CoreLabel> tokenize(String str) {
        Tokenizer<CoreLabel> tokenizer =
            tokenizerFactory.getTokenizer(
                new StringReader(str));    
        return tokenizer.tokenize();
    }

    public static void main(String[] args) { 
        String str = "My dog also likes eating sausage.";
        Parser parser = new Parser(); 
        Tree tree = parser.parse(str);  

        List<Tree> leaves = tree.getLeaves();
        // Print each word with its POS tag (the label of the leaf's parent node)
        for (Tree leaf : leaves) { 
            Tree parent = leaf.parent(tree);
            System.out.print(leaf.label().value() + "-" + parent.label().value() + " ");
        }
        System.out.println();               
    }
}
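For the graph step mentioned in the question, here is a minimal sketch of one possible approach: once you have word-to-word relations (for instance from the parser's typed-dependency output), you can write them out as Graphviz DOT text and render that, instead of using JGraph or JUNG. `DependencyDot` and `Edge` are hypothetical helper names, not part of the Stanford Parser API, and the relations in `main` are written by hand for illustration rather than taken from actual parser output.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Stanford Parser API.
class DependencyDot {

    // One labeled edge: governor --relation--> dependent.
    static final class Edge {
        final String governor;
        final String relation;
        final String dependent;

        Edge(String governor, String relation, String dependent) {
            this.governor = governor;
            this.relation = relation;
            this.dependent = dependent;
        }
    }

    // Escape backslashes and double quotes so the text is a valid DOT quoted string.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Render the edges as a Graphviz digraph, one edge per line.
    static String toDot(List<Edge> edges) {
        StringBuilder sb = new StringBuilder("digraph sentence {\n");
        for (Edge e : edges) {
            sb.append("  ").append(quote(e.governor))
              .append(" -> ").append(quote(e.dependent))
              .append(" [label=").append(quote(e.relation)).append("];\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Hand-written relations for "Ohio is located in America."
        // (illustrative values, not actual parser output).
        List<Edge> edges = Arrays.asList(
            new Edge("located", "nsubjpass", "Ohio"),
            new Edge("located", "auxpass", "is"),
            new Edge("located", "prep_in", "America"));
        System.out.print(toDot(edges));
    }
}
```

The printed text can be saved to a file and rendered with Graphviz. Note that this sketch uses the word itself as the node name, so a word that occurs twice in a sentence collapses into one node; appending the token index to each name would keep them distinct.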

Could you please give the complete code of a Java program that integrates the parser and, for a given sentence, outputs its words matched to their grammatical categories (parts of speech)? I would use the output to generate a graph. It would be very useful for completing my project. Thank you — Vishnu Kumar, Oct 17 '13 at 21:01
I did as you mentioned above: I added the jars and set up the build path. Now there is an error: cannot find Tree. Could you please suggest what to do? Thank you. — Vishnu Kumar, Oct 18 '13 at 00:11
@VishnuKumar: Sorry! Add `import edu.stanford.nlp.trees.Tree;` — user278064, Oct 18 '13 at 10:33
I imported the package as you mentioned. After I ran the program it shows "Build Successful" and there's no output. Now how can I get the tokens of the string "My dog also likes eating sausage" as separate words with their type (noun, pronoun, etc.)? — Vishnu Kumar, Oct 18 '13 at 14:29
Running the code above, it won't compile for me (it doesn't recognize `Parser parser = new Parser();`). Instead I made the parser field (`private final LexicalizedParser parser`) static and commented out the `Parser...` row, and then it works for me. — Tomer Cagan, Feb 24 '15 at 21:30
