I am trying to parse Markdown using a grammar written with Parslet. However, I cannot get past indented code blocks because everything I tried so far got stuck in recursion. They look like this:
    This is a indented code block.
    Second line.

    Code block continues after blank line.
    There can be any number of chunks,
    separated by not more than one blank line.
To solve this, I wrote a minimal example which replaces each line (including its `\n`) with `a` and each blank line (`\n\n`) with a space, e.g. `a aaa aa`.
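The mapping used by the minimal example can be sketched in plain Ruby. This is only an illustration: `compact_form` is a hypothetical helper name, not part of Parslet or of the parser, and it assumes chunks are separated by exactly one blank line.

```ruby
# Hypothetical helper (an assumption for illustration, not part of the grammar):
# collapse every line of a chunk into one "a" and every blank line between
# chunks into a single space.
def compact_form(text)
  text.split("\n\n")                                       # one entry per chunk
      .map { |chunk| 'a' * chunk.split("\n").length }      # one "a" per line
      .join(' ')                                           # blank line -> space
end

puts compact_form("a\n\na\na\na\n\na\na")  # prints: a aaa aa
```

The actual line content is irrelevant here; only the number of lines per chunk and the chunk boundaries survive the mapping.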
# recurring_group_parser.rb
require 'parslet'
require 'rspec'
require 'parslet/rig/rspec'
class RecurringGroupParser < Parslet::Parser
  root(:block)

  rule :block do
    chunk.repeat(1, 3)
  end

  rule :chunk do
    str('a').repeat(1, 3) >> space
  end

  rule :space do
    str(' ') | chunk.absent?
  end
end
describe RecurringGroupParser do
it 'should parse a' do
is_expected.to parse "a"
end
it 'should parse aa' do
is_expected.to parse "aa"
end
it 'should parse aaa' do
is_expected.to parse "aaa"
end
it 'should parse a a' do
is_expected.to parse "a a"
end
it 'should parse aa a' do
is_expected.to parse "aa a"
end
it 'should parse aaa a' do
is_expected.to parse "aaa a"
end
it 'should parse a aa' do
is_expected.to parse "a aa"
end
it 'should parse a aaa' do
is_expected.to parse "a aaa"
end
it 'should parse aa a' do
is_expected.to parse "aa a"
end
it 'should parse aa aa' do
is_expected.to parse "aa aa"
end
it 'should parse aa aaa' do
is_expected.to parse "aa aaa"
end
it 'should parse aaa aa' do
is_expected.to parse "aaa aa"
end
it 'should parse aaa aaa' do
is_expected.to parse "aaa aaa"
end
it 'should parse a a a' do
is_expected.to parse "a a a"
end
it 'should parse aa a a' do
is_expected.to parse "aa a a"
end
it 'should parse aaa a a' do
is_expected.to parse "aaa a a"
end
it 'should parse a aa a' do
is_expected.to parse "a aa a"
end
it 'should parse aa aa a' do
is_expected.to parse "aa aa a"
end
it 'should parse aaa aa a' do
is_expected.to parse "aaa aa a"
end
it 'should parse a aaa a' do
is_expected.to parse "a aaa a"
end
it 'should parse aa aaa a' do
is_expected.to parse "aa aaa a"
end
it 'should parse aaa aaa a' do
is_expected.to parse "aaa aaa a"
end
it 'should parse a a aa' do
is_expected.to parse "a a aa"
end
it 'should parse aa a aa' do
is_expected.to parse "aa a aa"
end
it 'should parse aaa a aa' do
is_expected.to parse "aaa a aa"
end
it 'should parse a aa aa' do
is_expected.to parse "a aa aa"
end
it 'should parse aa aa aa' do
is_expected.to parse "aa aa aa"
end
it 'should parse aaa aa aa' do
is_expected.to parse "aaa aa aa"
end
it 'should parse a aaa aa' do
is_expected.to parse "a aaa aa"
end
it 'should parse aa aaa aa' do
is_expected.to parse "aa aaa aa"
end
it 'should parse aaa aaa aa' do
is_expected.to parse "aaa aaa aa"
end
it 'should parse a a aaa' do
is_expected.to parse "a a aaa"
end
it 'should parse aa a aaa' do
is_expected.to parse "aa a aaa"
end
it 'should parse aaa a aaa' do
is_expected.to parse "aaa a aaa"
end
it 'should parse a aa aaa' do
is_expected.to parse "a aa aaa"
end
it 'should parse aa aa aaa' do
is_expected.to parse "aa aa aaa"
end
it 'should parse aaa aa aaa' do
is_expected.to parse "aaa aa aaa"
end
it 'should parse a aaa aaa' do
is_expected.to parse "a aaa aaa"
end
it 'should parse aa aaa aaa' do
is_expected.to parse "aa aaa aaa"
end
it 'should parse aaa aaa aaa' do
is_expected.to parse "aaa aaa aaa"
end
end
Running `rspec recurring_group_parser.rb` works fine. Only when I put the newlines back in does it stall:
# recurring_group_parser.rb
require 'parslet'
require 'rspec'
require 'parslet/rig/rspec'
class RecurringGroupParser < Parslet::Parser
  root(:block)

  rule :block do
    chunk.repeat(1, 3)
  end

  rule :chunk do
    line.repeat(1, 3) >> blank_line
  end

  rule :line do
    str('a') >> newline
  end

  rule :blank_line do
    newline.repeat(2) | chunk.absent?
  end

  rule :newline do
    str("\n") | any.absent?
  end
end
describe RecurringGroupParser do
it 'should parse a' do
is_expected.to parse "a"
end
it 'should parse aa' do
is_expected.to parse "a\na"
end
it 'should parse aaa' do
is_expected.to parse "a\na\na"
end
it 'should parse a a' do
is_expected.to parse "a\n\na"
end
it 'should parse aa a' do
is_expected.to parse "a\na\n\na"
end
it 'should parse aaa a' do
is_expected.to parse "a\na\na\n\na"
end
it 'should parse a aa' do
is_expected.to parse "a\n\na\na"
end
it 'should parse a aaa' do
is_expected.to parse "a\n\na\na\na"
end
it 'should parse aa a' do
is_expected.to parse "a\na\n\na"
end
it 'should parse aa aa' do
is_expected.to parse "a\na\n\na\na"
end
it 'should parse aa aaa' do
is_expected.to parse "a\na\n\na\na\na"
end
it 'should parse aaa aa' do
is_expected.to parse "a\na\na\n\na\na"
end
it 'should parse aaa aaa' do
is_expected.to parse "a\na\na\n\na\na\na"
end
it 'should parse a a a' do
is_expected.to parse "a\n\na\n\na"
end
it 'should parse aa a a' do
is_expected.to parse "a\na\n\na\n\na"
end
it 'should parse aaa a a' do
is_expected.to parse "a\na\na\n\na\n\na"
end
it 'should parse a aa a' do
is_expected.to parse "a\n\na\na\n\na"
end
it 'should parse aa aa a' do
is_expected.to parse "a\na\n\na\na\n\na"
end
it 'should parse aaa aa a' do
is_expected.to parse "a\na\na\n\na\na\n\na"
end
it 'should parse a aaa a' do
is_expected.to parse "a\n\na\na\na\n\na"
end
it 'should parse aa aaa a' do
is_expected.to parse "a\na\n\na\na\na\n\na"
end
it 'should parse aaa aaa a' do
is_expected.to parse "a\na\na\n\na\na\na\n\na"
end
it 'should parse a a aa' do
is_expected.to parse "a\n\na\n\na\na"
end
it 'should parse aa a aa' do
is_expected.to parse "a\na\n\na\n\na\na"
end
it 'should parse aaa a aa' do
is_expected.to parse "a\na\na\n\na\n\na\na"
end
it 'should parse a aa aa' do
is_expected.to parse "a\n\na\na\n\na\na"
end
it 'should parse aa aa aa' do
is_expected.to parse "a\na\n\na\na\n\na\na"
end
it 'should parse aaa aa aa' do
is_expected.to parse "a\na\na\n\na\na\n\na\na"
end
it 'should parse a aaa aa' do
is_expected.to parse "a\n\na\na\na\n\na\na"
end
it 'should parse aa aaa aa' do
is_expected.to parse "a\na\n\na\na\na\n\na\na"
end
it 'should parse aaa aaa aa' do
is_expected.to parse "a\na\na\n\na\na\na\n\na\na"
end
it 'should parse a a aaa' do
is_expected.to parse "a\n\na\n\na\na\na"
end
it 'should parse aa a aaa' do
is_expected.to parse "a\na\n\na\n\na\na\na"
end
it 'should parse aaa a aaa' do
is_expected.to parse "a\na\na\n\na\n\na\na\na"
end
it 'should parse a aa aaa' do
is_expected.to parse "a\n\na\na\n\na\na\na"
end
it 'should parse aa aa aaa' do
is_expected.to parse "a\na\n\na\na\n\na\na\na"
end
it 'should parse aaa aa aaa' do
is_expected.to parse "a\na\na\n\na\na\n\na\na\na"
end
it 'should parse a aaa aaa' do
is_expected.to parse "a\n\na\na\na\n\na\na\na"
end
it 'should parse aa aaa aaa' do
is_expected.to parse "a\na\n\na\na\na\n\na\na\na"
end
it 'should parse aaa aaa aaa' do
is_expected.to parse "a\na\na\n\na\na\na\n\na\na\na"
end
end
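The newline inputs can be derived mechanically from the compact ones, which may help keep the two specs in sync. A minimal sketch, assuming every line is a single `a` (so a three-line chunk becomes `a\na\na`); `expand` and `compact_cases` are hypothetical helper names, not part of Parslet or RSpec:

```ruby
# Hypothetical helper: turn the compact form ("aaa a") back into the newline
# form, one "a" per line and one blank line between chunks.
def expand(compact)
  compact.split(' ')
         .map { |chunk| chunk.chars.join("\n") }  # "aaa" -> "a\na\na"
         .join("\n\n")                            # space -> blank line
end

# Hypothetical helper: every combination of 1..max_chunks chunks with
# 1..max_lines lines each, in compact form.
def compact_cases(max_chunks: 3, max_lines: 3)
  sizes = (1..max_lines).to_a
  (1..max_chunks).flat_map do |n|
    sizes.repeated_permutation(n).map do |combo|
      combo.map { |k| 'a' * k }.join(' ')
    end
  end
end

puts compact_cases.length      # prints: 39  (3 + 9 + 27 combinations)
puts expand('aaa a').inspect   # prints: "a\na\na\n\na"
```

Inside a `describe` block, something like `compact_cases.each { |c| it("should parse #{c}") { is_expected.to parse expand(c) } }` could then replace the hand-written examples.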
To keep the example simple, each line consists of a single `a` and is not indented. That can easily be changed later and is not related to the failure to finish parsing. I am fairly sure there is a collision between `chunk.absent?` in rule `:blank_line` and `any.absent?` in rule `:newline`, but I have no idea how to fix it or which criteria would break the recursion. Any help is appreciated!
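One hypothesis worth checking is that the stall is a loop rather than a recursion: at the end of input, `any.absent?` succeeds without consuming anything, so an unbounded `newline.repeat(2)` might keep matching forever. The following is a toy model in plain Ruby, not Parslet's actual implementation; `NEWLINE_OR_EOF` and `repeat_unbounded` are hypothetical names, and the step cap exists only so the demonstration terminates.

```ruby
# Toy matcher (assumption, mimics `str("\n") | any.absent?`): returns the new
# position on success, nil on failure.
NEWLINE_OR_EOF = lambda do |input, pos|
  if input[pos] == "\n"
    pos + 1          # consumed one newline
  elsif pos >= input.length
    pos              # end of input: success, but nothing consumed
  end                # anything else: nil (failure)
end

# Toy repetition with a minimum and no maximum. Returns the final position,
# nil if fewer than `min` matches were found, or :stalled once the step cap
# is hit.
def repeat_unbounded(matcher, input, pos, min:, max_steps: 1_000)
  count = 0
  loop do
    nxt = matcher.call(input, pos)
    break if nxt.nil?
    count += 1
    return :stalled if count >= max_steps
    pos = nxt
  end
  count >= min ? pos : nil
end

puts repeat_unbounded(NEWLINE_OR_EOF, "a\n\na", 1, min: 2).inspect  # prints: 3
puts repeat_unbounded(NEWLINE_OR_EOF, "a", 1, min: 2).inspect       # prints: :stalled
```

If the real parser behaves like this model, the culprit would be repeating a rule that can succeed on empty input, independent of `chunk.absent?`.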