How can I write Sanskrit grammar rules for parsing with NLTK in Python? Is there a tagged Sanskrit corpus available in NLTK?
I tried writing a grammar in the usual way:
from nltk import CFG

grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> NN JJ| NNP VP| 'I'
VP -> V NP | VP PP
NN -> u'बालः' | u'पुस्तकं'|u'कागदम्'
VP -> u'भजति'|u'अधावत्' |u'अर्चयन्ति'
NNP -> u'हरिं '
""")
But it raises the following error:
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 519, in fromstring
    encoding=encoding)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 1245, in read_grammar
    lines = input.split('\n')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 76: ordinal not in range(128)
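For reference, here is a minimal sketch of how a grammar like this might load, assuming Python 3, where string literals are already Unicode. Byte 0xe0 is the first byte of every Devanagari character in UTF-8, which suggests the grammar text reached NLTK as undecoded bytes. Under Python 2 the likely equivalent would be a source-encoding declaration plus a `u"""..."""` literal, but that is an assumption I have not verified. The rule shapes and the helper names `GRAMMAR_TEXT`, `HAVE_NLTK` and `parse_sentence` are illustrative only, not a real Sanskrit grammar.

```python
import importlib.util

# Terminals are plain quoted strings: the grammar text has its own syntax,
# so Python's u'...' prefix does not belong inside it.
# The rules below are illustrative assumptions, not a real Sanskrit grammar;
# Sanskrit word order is far freer than this fixed subject-object-verb shape.
GRAMMAR_TEXT = """
S -> NP VP
VP -> NP V | V
NP -> N
N -> 'बालः' | 'हरिं' | 'पुस्तकं'
V -> 'भजति' | 'अधावत्'
"""

# True only if the running interpreter can locate NLTK.
HAVE_NLTK = importlib.util.find_spec("nltk") is not None


def parse_sentence(sentence):
    """Return all parse trees for a whitespace-tokenised sentence."""
    from nltk import CFG, ChartParser  # imported lazily; needs NLTK installed

    grammar = CFG.fromstring(GRAMMAR_TEXT)
    parser = ChartParser(grammar)
    return list(parser.parse(sentence.split()))


if HAVE_NLTK:
    for tree in parse_sentence("बालः हरिं भजति"):
        print(tree)
```

Note that each terminal has to match a token exactly, so a stray trailing space inside the quotes (as in 'हरिं ' above) would stop that word from ever matching.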
I originally started with Python 3, but even after installing the nltk package, importing it fails with "ImportError: No module named 'nltk'". Can anyone tell me how to install NLTK for Python 3, and why it gives this error message?
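The paths in the traceback point at python2.7/dist-packages, so one plausible explanation is that NLTK was installed for Python 2 only and Python 3 cannot see it. The sketch below, run with the Python 3 interpreter, reports which interpreter is active and whether it can find nltk. The helper names `pip_command_for_this_interpreter` and `is_installed` are hypothetical, and the printed command is only a suggestion that assumes pip is available for that interpreter.

```python
import importlib.util
import sys


def pip_command_for_this_interpreter(package):
    """Build the pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def is_installed(module_name):
    """Return True if the running interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])
if is_installed("nltk"):
    print("nltk is importable here")
else:
    # Installing through "-m pip" ties the install to this exact interpreter.
    print("nltk is missing; try:", " ".join(pip_command_for_this_interpreter("nltk")))
```

Running pip as a module of a specific interpreter avoids the situation where a bare `pip` command belongs to a different Python version than the one used to run the script.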