I have a scripting language that I am implementing where every statement is terminated by a newline, except possibly the last one. It's also possible to have lines that are just newlines. Here's one implementation (statement code omitted since it doesn't really add anything):
@_('statement_set statement', 'statement_set')
def script(self, p):
if len(p) > 1:
p.statement_set.append(p.statement)
return p.statement_set
@_('statement_or_NL')
def statement_set(self, p):
return p.statement_or_NL
@_('statement_set statement_or_NL')
def statement_set(self, p):
p.statement_set.append(p.statement_or_NL)
return p.statement_set
@_('statement NEWLINE', 'NEWLINE')
def statement_or_NL(self, p):
if len(p) > 1:
return [p.statement]
else:
return []
@_('PLUS', 'MINUS')
def statement(self, p):
return p[0]
That generates the following rules:
Rule 0 S' -> script
Rule 1 script -> statement_set
Rule 2 script -> statement_set statement
Rule 3 statement_set -> statement_set statement_or_NL
Rule 4 statement_set -> statement_or_NL
Rule 5 statement_or_NL -> NEWLINE
Rule 6 statement_or_NL -> statement NEWLINE
If you'd like, I can post the generated states as well. The issue I'm running into is that, even when I eliminate all shift-reduce conflicts, I still almost always fail in one of the three scenarios of code that starts with a newline, code that ends with a newline, and code that has no newline at the end. Something goes awry in the parsing such that it will read the final statement (which ends with an EOF) as a "statement NEWLINE" and then error out because there is no new-line. Worse comes to worse, I have a version that works as long as there's no final statement without a newline, so I could probably rig the parser to either add it to the initial string with the code, or to insert the token, but that feels clunky.
I got this working in Ply previously, and I thought it was essentially the same code, but it does not seem to work for me.