
I need to determine whether a given sentence is a question or an ordinary statement, without defining a chunk grammar. I tried drawing a parse tree, which requires a grammar, but the tree doesn't tell me whether the sentence is a question or a statement. I've heard that the Penn Treebank is one possible solution, but I couldn't find any help on using it for this.

import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer

train_text = state_union.raw("text1.txt")
sample_text = state_union.raw("text2.txt")

# PunktSentenceTokenizer is an unsupervised sentence tokenizer;
# passing text to the constructor trains it on that text.
custom_sent_tokenizer = PunktSentenceTokenizer(train_text)

tokenized = custom_sent_tokenizer.tokenize(sample_text)
print(tokenized)

# Chunk: optional adverbs, optional verbs, one or more proper nouns,
# then an optional singular noun.
chunkGram = r"""Chunk: {<RB.?>*<VB.?>*<NNP>+<NN>?}"""
chunkParser = nltk.RegexpParser(chunkGram)

try:
    for i in tokenized:
        words = nltk.word_tokenize(i)
        tagged = nltk.pos_tag(words)
        print(tagged)
        chunked = chunkParser.parse(tagged)
        print(chunked)
        chunked.draw()
except Exception as e:
    print(str(e))
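
For reference, here is a rough sketch of the kind of check I'm after, working only from the POS tags and no chunk grammar. It is a heuristic, not a proper parse: the function name looks_like_question and the word and tag sets are my own assumptions, and it expects (word, tag) pairs with Penn Treebank tags, as nltk.pos_tag returns them. It will misfire on sentences such as "When I arrived, he left." or on imperatives that start with "Do".

```python
# Penn Treebank tags for wh-words (which, what, who, whose, when, where, why, how).
WH_TAGS = {"WDT", "WP", "WP$", "WRB"}

# Hypothetical list of auxiliaries that can open a yes/no question.
AUXILIARIES = {
    "am", "is", "are", "was", "were", "do", "does", "did",
    "have", "has", "had", "can", "could", "will", "would",
    "shall", "should", "may", "might", "must",
}

# Tags that plausibly start a subject right after an inverted auxiliary.
SUBJECT_TAGS = {"PRP", "NN", "NNS", "NNP", "NNPS", "DT", "EX"}


def looks_like_question(tagged):
    """Guess whether a POS-tagged sentence is a question.

    `tagged` is a list of (word, tag) pairs. This is a heuristic only.
    """
    if not tagged:
        return False

    # A trailing question mark is the strongest signal.
    if tagged[-1][0] == "?":
        return True

    first_word, first_tag = tagged[0]

    # Sentence opens with a wh-word: "Where are you"
    if first_tag in WH_TAGS:
        return True

    # Auxiliary followed by a likely subject: "Is this a question"
    if first_word.lower() in AUXILIARIES and len(tagged) > 1:
        return tagged[1][1] in SUBJECT_TAGS

    return False
```

With NLTK this would be called as looks_like_question(nltk.pos_tag(nltk.word_tokenize(sentence))) for each sentence in tokenized.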


  • I found a solution for this on Stack Overflow, http://stackoverflow.com/questions/6115677/english-grammar-for-parsing-in-nltk, but the line from stat_parser import Parser fails with the error "No module named stat_parser". I looked for this module but couldn't find any help. – chanch Jun 23 '16 at 08:14
  • I used the Java stanford-parser.jar, which works fine. – chanch Mar 17 '17 at 04:10
