I asked treetop to compile your language into an .rb file. That gave me something to dig into:
$ tt -o /tmp/rip.rb /tmp/rip.treetop
Then I used this little stub to recreate the loop:
require 'treetop'
load '/tmp/rip.rb'
RipParser.new.parse('')
This hangs. Now, isn't that interesting! An empty string reproduces the behavior just as well as the dozen-or-so-line example in your question.
To find out where it's hanging, I used an Emacs keyboard macro to edit rip.rb, adding a debug statement to the entry of each method. For example:
def _nt_root
p [__LINE__, '_nt_root'] #DEBUG
start_index = index
Now we can see the scope of the loop:
[16, "root"]
[21, "_nt_root"]
[57, "_nt_statement"]
...
[3293, "_nt_eol"]
[3335, "_nt_semicolon"]
[3204, "_nt_comment"]
[57, "_nt_statement"]
[57, "_nt_statement"]
[57, "_nt_statement"]
...
Further debugging from there reveals that an integer is allowed to be an empty string:
rule integer
digit*
end
This indirectly allows a statement to be an empty string, and the top-level rule statement*
to forever consume empty statements. Changing *
to +
fixes the loop, but reveals another problem:
/tmp/rip.rb:777:in `_nt_object': stack level too deep (SystemStackError)
from /tmp/rip.rb:757:in `_nt_compound_object'
from /tmp/rip.rb:1726:in `_nt_range'
from /tmp/rip.rb:1671:in `_nt_special_literals'
from /tmp/rip.rb:825:in `_nt_literal_object'
from /tmp/rip.rb:787:in `_nt_object'
from /tmp/rip.rb:757:in `_nt_compound_object'
from /tmp/rip.rb:1726:in `_nt_range'
from /tmp/rip.rb:1671:in `_nt_special_literals'
... 3283 levels...
Range is left-recursing, indirectly, via special_literals, literal_object, object, and compound_object. Treetop, when faced with left recursion, eats stack until it pukes. I don't have a quick fix for that problem, but at least you've got a stack trace to go from now.
Also, this is not your immediate problem, but the definition of digit
is odd: It can either one digit, or multiple. This causes digit*
or digit+
to allow the (presumably) illegal integer 1________2
.