7

I would like to generate a sentence from a set of input words. For example:

Input:

Mary
chase
the monkey

Output:

Mary chases the monkey.

This could be done using the SimpleNLG library (http://code.google.com/p/simplenlg/) in the following way:

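import simplenlg.framework.NLGFactory;
import simplenlg.lexicon.Lexicon;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

// Setup assumed from the SimpleNLG v4 API; the original snippet omitted it.
Lexicon lexicon = Lexicon.getDefaultLexicon();
NLGFactory nlgFactory = new NLGFactory(lexicon);
Realiser realiser = new Realiser(lexicon);

String subject = "Mary";
String verb = "chase";
String object = "the monkey";

SPhraseSpec p = nlgFactory.createClause(); // empty clause to fill in
p.setSubject(subject);
p.setVerb(verb);
p.setObject(object);
String output = realiser.realiseSentence(p); // inflects the verb, adds the full stop
System.out.println(output);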

This will generate the sentence "Mary chases the monkey." But I would like to automate this, so that I input the words and the sentence gets generated. That would require some preprocessing to determine which word is the subject, which is the verb, and which is the object. I know there are POS (part-of-speech) tagging libraries, but they don't specify whether a word is a subject or an object. Any suggestions on how this could be done? I would also like it to work for longer sentences with multiple objects, adverbs, etc.

Daniel Haley
Radek
  • I'm not sure what you are asking. Do you want to enter a bag of words (where the order isn't considered) and have a sentence output? How would the program know if you wanted "Mary chases the monkey" or "The monkey chases Mary"? – Chris Jun 02 '11 at 13:13
  • Parsers (OpenNLP, Stanford) start with a sentence and tell you what plays the role of subject, object etc. – Chris Jun 02 '11 at 13:14

3 Answers

3

The most common approach is to build n-gram statistics and then construct the most probable sequence of words. One famous example can be found here: http://scribe.googlelabs.com/
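To make the n-gram idea concrete, here is a minimal sketch, assuming a bigram model trained on a handful of lowercase, whitespace-tokenised sentences. The class name `BigramOrderer` and its methods are hypothetical, not part of any library. It scores every permutation of the input words by summing log(count + 1) over adjacent pairs and keeps the best one. Note that it only orders surface forms; it does not inflect "chase" into "chases".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: orders a bag of words using bigram counts.
final class BigramOrderer {
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    /** Counts adjacent word pairs in one training sentence, with boundary markers. */
    public void train(String sentence) {
        String[] tokens = ("<s> " + sentence.trim().toLowerCase() + " </s>").split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            bigramCounts.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
        }
    }

    /** Sum of log(count + 1) over adjacent pairs; the +1 keeps unseen pairs finite. */
    public double score(List<String> words) {
        double total = 0.0;
        String previous = "<s>";
        for (String word : words) {
            total += Math.log(bigramCounts.getOrDefault(previous + " " + word, 0) + 1.0);
            previous = word;
        }
        return total + Math.log(bigramCounts.getOrDefault(previous + " </s>", 0) + 1.0);
    }

    /** Tries every permutation (fine for a few words only) and returns the best-scoring one. */
    public List<String> bestOrder(List<String> words) {
        List<String> best = new ArrayList<>(words);
        double[] bestScore = {score(best)};
        permute(new ArrayList<>(words), 0, best, bestScore);
        return best;
    }

    private void permute(List<String> items, int start, List<String> best, double[] bestScore) {
        if (start == items.size()) {
            double candidate = score(items);
            if (candidate > bestScore[0]) {
                bestScore[0] = candidate;
                best.clear();
                best.addAll(items);
            }
            return;
        }
        for (int i = start; i < items.size(); i++) {
            Collections.swap(items, start, i);
            permute(items, start + 1, best, bestScore);
            Collections.swap(items, start, i); // undo the swap before the next choice
        }
    }
}
```

Exhaustive permutation grows factorially, so a real system would more likely use beam search over a much larger n-gram table; the toy training set here is purely illustrative.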

yura
1

In order to obtain the subject, verb or object for the input sentence you need to perform syntactic analysis or parsing.

There are two main groups of parsing tools, constituent parsers and dependency parsers, but usually the former is the more direct path to obtain what you need.

These are some research constituent parsers that you may try:

This related question may also help: Simple Natural Language Processing Startup for Java

Community
zdepablo
0

It would depend on the order of the words. If the order is Mary chase the monkey then the output would be Mary chases the monkey. If the order is the monkey chase Mary then the output would be The monkey chases Mary.
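A minimal sketch of this order-based assignment, assuming the input always arrives as exactly three chunks in subject, verb, object order. The class `PositionalRoles` is hypothetical, and its `realise` method is a crude stand-in for what SimpleNLG's realiser does (it only handles regular third-person singular verbs).

```java
import java.util.List;

// Hypothetical helper: assigns roles purely by position in the input.
final class PositionalRoles {
    public final String subject;
    public final String verb;
    public final String object;

    private PositionalRoles(String subject, String verb, String object) {
        this.subject = subject;
        this.verb = verb;
        this.object = object;
    }

    /** First chunk is the subject, second the verb, third the object. */
    public static PositionalRoles fromOrderedChunks(List<String> chunks) {
        if (chunks.size() != 3) {
            throw new IllegalArgumentException("expected subject, verb, object");
        }
        for (String chunk : chunks) {
            if (chunk == null || chunk.isEmpty()) {
                throw new IllegalArgumentException("chunks must be non-empty");
            }
        }
        return new PositionalRoles(chunks.get(0), chunks.get(1), chunks.get(2));
    }

    /** Naive realisation: capitalise the subject, add -s/-es to the verb, end with a full stop. */
    public String realise() {
        boolean sibilant = verb.endsWith("s") || verb.endsWith("x") || verb.endsWith("z")
                || verb.endsWith("ch") || verb.endsWith("sh");
        String inflected = verb + (sibilant ? "es" : "s");
        String capitalised = subject.substring(0, 1).toUpperCase() + subject.substring(1);
        return capitalised + " " + inflected + " " + object + ".";
    }
}
```

With the chunks "Mary", "chase", "the monkey" this yields "Mary chases the monkey."; swapping the first and last chunk yields "The monkey chases Mary.". In practice the three fields would be passed to SimpleNLG's `setSubject`, `setVerb` and `setObject` instead of the naive `realise`.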

I had a look at the OpenNLP parser, but it takes a complete sentence as input and parses it. What I have as input is individual words, and I need to build a sentence from them.

And anyway, when I look at the example:

The quick brown fox jumps over the lazy dog .

The parser should now print the following to the console.

(TOP (NP (NP (DT The) (JJ quick) (JJ brown) (NN fox) (NNS jumps)) (PP (IN over) (NP (DT the) (JJ lazy) (NN dog))) (. .)))

All I can see is parts of speech. I can't see it specifying objects, subjects etc. unless there is such a function in the API.
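For what it is worth, grammatical roles are usually read off the shape of such a tree rather than returned by a dedicated call. Below is a rough sketch, assuming Penn Treebank style labels and the common simplification that the subject is the first NP directly under S and the object is the first NP directly under VP. The class `ParseRoles` is hypothetical and self-contained; it is not OpenNLP's API and only reads the bracketed string a parser prints.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads a bracketed parse and applies a simple role heuristic.
final class ParseRoles {
    /** One node of the tree; pre-terminals such as (NN dog) carry a word. */
    static final class Node {
        final String label;
        final String word; // null for non-terminal nodes
        final List<Node> children = new ArrayList<>();

        Node(String label, String word) {
            this.label = label;
            this.word = word;
        }

        /** Words under this node, joined left to right. */
        String text() {
            if (word != null) return word;
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(child.text());
            }
            return sb.toString();
        }

        /** First direct child whose label starts with the given prefix, or null. */
        Node firstChild(String prefix) {
            for (Node child : children) {
                if (child.label.startsWith(prefix)) return child;
            }
            return null;
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private ParseRoles(String bracketed) {
        tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Parses a well-formed bracketed tree; malformed input is not handled. */
    public static Node parse(String bracketed) {
        return new ParseRoles(bracketed).readNode();
    }

    private Node readNode() {
        pos++; // consume "("
        String label = tokens[pos++];
        if (!tokens[pos].equals("(")) { // pre-terminal: (TAG word)
            Node leaf = new Node(label, tokens[pos++]);
            pos++; // consume ")"
            return leaf;
        }
        Node node = new Node(label, null);
        while (tokens[pos].equals("(")) node.children.add(readNode());
        pos++; // consume ")"
        return node;
    }

    /** Returns {subject, verb, object}; an entry is null when the pattern is absent. */
    public static String[] roles(Node root) {
        Node clause = root.label.equals("S") ? root : root.firstChild("S");
        if (clause == null) return new String[] {null, null, null};
        Node subject = clause.firstChild("NP");
        Node vp = clause.firstChild("VP");
        Node verb = vp == null ? null : vp.firstChild("VB");
        Node object = vp == null ? null : vp.firstChild("NP");
        return new String[] {
            subject == null ? null : subject.text(),
            verb == null ? null : verb.text(),
            object == null ? null : object.text()
        };
    }
}
```

Given an illustrative parse such as `(TOP (S (NP (NNP Mary)) (VP (VBZ chases) (NP (DT the) (NN monkey))) (. .)))`, the heuristic picks out "Mary", "chases" and "the monkey". The fox parse quoted above contains no S or VP node at all ("jumps" is tagged NNS), so the same heuristic finds nothing there, which is consistent with only seeing parts of speech in that output.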

If I am wrong, correct me.

Radek