2

Does the Earley parser have expected problems with simple cycles?

I've made my own implementation, but it's pretty similar to this one, which is very readable and about 150 lines total (and I, of course, didn't write it):

http://www.nightmare.com/rushing/python/earley.py

That one looks good to me, and works perfectly in the provided test case, but the very simple grammar

E : [[E,E],[ident]]

doesn't seem to work. It should generate an arbitrary number of "ident" tokens, assuming that E is the starting non-terminal and ident is a terminal. But this grammar, and any similar style grammar, are completely missed by the parser.

Is this a problem in the Earley algoritm (I don't think it is), or a problem in this implementation; if it's implementation based, is there a (relatively) easy solution or does the algorithm need to be rebuilt?

hippietrail
  • 15,848
  • 18
  • 99
  • 158
en_Knight
  • 5,301
  • 2
  • 26
  • 46

1 Answers1

2

Yes, it has an expected problem with groups of rules of the kind:

X1 ::= X2

X2 ::= X3

...

Xn ::= X1

In this case the algorithm may enter in an infinite recursion loop, if you are not checking for states that you've already added to the Earley chart.

In the case you've presented above it must not enter in an infinite loop, because the expansions of the rule E ::= E E will be limited by your input. Check the implementation here:

https://en.wikipedia.org/wiki/Earley_parser#The_algorithm

I have already used this implementation, and it works fine.

  • 1
    If the program is not checking for Earley items already added to the chart, then it's not really an implementation of the Earley algorithm, is it? Earley's specification of the algorithm is quite explicit on the point. – C. M. Sperberg-McQueen Jan 27 '16 at 20:00