I'm not sure exactly what you are attempting to do, but take a look at Treetop. It lets you define a grammar in a file and compiles that grammar into a parser written in Ruby. It's a PEG parser, so it's also generally easier to work with than traditional LALR parsers.
Here's an example that parses a bit of Ruby. Of course, you will have to extend the grammar to fit your needs, which may be difficult since Ruby is rather complex to parse:
require 'treetop'

Treetop.load_from_string DATA.read

parser = TestParser.new
p parser.parse('def func
   6 + 5
end')

__END__
grammar Test
  rule function
    'def' space function_name function_body 'end'
  end

  rule function_name
    [A-Za-z]+
  end

  rule function_body
    space expression space
  end

  rule expression
    '6 + 5'
  end

  rule space
    [\t \n]+
  end
end
Parsing this returns an AST:
SyntaxNode+Function0 offset=0, "...ef func\n   6 + 5\nend" (space,function_name,function_body):
  SyntaxNode offset=0, "def"
  SyntaxNode offset=3, " ":
    SyntaxNode offset=3, " "
  SyntaxNode offset=4, "func":
    SyntaxNode offset=4, "f"
    SyntaxNode offset=5, "u"
    SyntaxNode offset=6, "n"
    SyntaxNode offset=7, "c"
  SyntaxNode+FunctionBody0 offset=8, "\n   6 + 5\n" (space1,expression,space2):
    SyntaxNode offset=8, "\n   ":
      SyntaxNode offset=8, "\n"
      SyntaxNode offset=9, " "
      SyntaxNode offset=10, " "
      SyntaxNode offset=11, " "
    SyntaxNode offset=12, "6 + 5"
    SyntaxNode offset=17, "\n":
      SyntaxNode offset=17, "\n"
  SyntaxNode offset=18, "end"
Also, you can compile a Treetop grammar file into Ruby code using the tt command-line tool:
tt test.treetop -o test-treetop.rb