I have NLTK installed on OS X (Lion 10.7.5), for use with Python 2.7.
The basic context-free grammars from the early chapters work splendidly; for instance, a plain CFG along these lines (a toy grammar of my own, not taken verbatim from the book) loads and parses without complaint:
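import nltk

# An ordinary (non-feature-based) CFG; this works fine on my setup.
cfg = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'these' N
N -> 'girls'
VP -> V
V -> 'sing'
""")
parser = nltk.ChartParser(cfg)
for tree in parser.parse('these girls sing'.split()):
    print(tree)
But when I attempt to load even basic examples of feature-based context-free grammars, such as: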
from __future__ import print_function
import nltk
from nltk import grammar, parse
g = """
% start DP
DP[AGR=?a] -> D[AGR=?a] N[AGR=?a]
D[AGR=[NUM='sg', PERS=3]] -> 'this' | 'that'
D[AGR=[NUM='pl', PERS=3]] -> 'these' | 'those'
D[AGR=[NUM='pl', PERS=1]] -> 'we'
D[AGR=[PERS=2]] -> 'you'
N[AGR=[NUM='sg', GND='m']] -> 'boy'
N[AGR=[NUM='pl', GND='m']] -> 'boys'
N[AGR=[NUM='sg', GND='f']] -> 'girl'
N[AGR=[NUM='pl', GND='f']] -> 'girls'
N[AGR=[NUM='sg']] -> 'student'
N[AGR=[NUM='pl']] -> 'students'
"""
grammar = grammar.FeatureGrammar.fromstring(g)
tokens = 'these girls'.split()
parser = parse.FeatureEarleyChartParser(grammar)
trees = parser.parse(tokens)
for tree in trees: print(tree)
(from: http://www.nltk.org/howto/featgram.html)
... I get the following error:
File "test_fcfg.py", line 18, in <module>
grammar = grammar.FeatureGrammar.fromstring(g)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 796, in fromstring
encoding=encoding)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 1270, in read_grammar
productions += _read_production(line, nonterm_parser, probabilistic)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 1220, in _read_production
return [Production(lhs, rhs) for rhs in rhsides]
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 270, in __init__
self._hash = hash((self._lhs, self._rhs))
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 203, in __hash__
self.freeze()
File "/Library/Python/2.7/site-packages/nltk/featstruct.py", line 373, in freeze self._freeze(set())
File "/Library/Python/2.7/site-packages/nltk/featstruct.py", line 395, in _freeze
for (fname, fval) in sorted(self._items()):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 56, in <lambda>
'__lt__': [('__gt__', lambda self, other: other < self),
...
...
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 56, in <lambda>
'__lt__': [('__gt__', lambda self, other: other < self),
RuntimeError: maximum recursion depth exceeded while calling a Python object
(the ellipses denote many repetitions of the functools.py lines immediately preceding and following them)
Googling doesn't turn up much of use; in fact, it doesn't turn up much about NLTK errors in general, which surprised me.
My understanding of the error message is that grammar.FeatureGrammar.fromstring(g) for some reason gets caught in what looks like an endless loop. Raising the recursion limit with the sys module doesn't help at all; I just wait a bit longer before seeing the same error message.
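For the record, this is roughly what I tried before running the script (the exact limit is arbitrary; larger values made no difference):
import sys

# Default limit is 1000; raising it only postpones the RuntimeError for me.
sys.setrecursionlimit(20000)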
I've also noticed with other NLTK examples that modules seem to have been moved around; for example, the book "Natural Language Processing with Python" frequently uses commands of the form lp = nltk.LogicParser(), but this class appears to have moved to nltk.sem.logic.LogicParser (see the snippet below). However, this doesn't seem to be the cause of the current problem.
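To be concrete, this is the kind of change I mean (assuming I've identified the right module; the exact path may differ between NLTK versions):
# As used in the book:
# lp = nltk.LogicParser()
# What I have to write with my install:
from nltk.sem.logic import LogicParser
lp = LogicParser()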
Is there a well-known or obvious cause in NLTK for the error documented here? And, possibly, a fix?