Maybe my partial solution will help to clarify what I meant in the question.
Let's say you have a somewhat non-trivial parser:
class MyParser < Parslet::Parser
rule(:dollars) {
match('[0-9]').repeat(1).as(:dollars)
}
rule(:comma_separated_dollars) {
match('[0-9]').repeat(1, 3).as(:dollars) >> ( match(',') >> match('[0-9]').repeat(3, 3).as(:dollars) ).repeat(1)
}
rule(:cents) {
match('[0-9]').repeat(2, 2).as(:cents)
}
rule(:currency) {
(str('$') >> (comma_separated_dollars | dollars) >> str('.') >> cents).as(:currency)
# order is important in (comma_separated_dollars | dollars)
}
end
Now if we want to parse a fixed-width Currency string; this isn't the easiest thing to do. Of course, you could figure out exactly how to express the repeat expressions in terms of the final width, but it gets really unnecessarily tricky, especially in the comma separated case. Also, in my use case, currency is really just one example. I want to be able to have an easy way to come up with fixed-width definitions for adresses, zip codes, etc....
This seems like something that should be handle-able by a PEG. I managed to write a prototype version, using Lookahead as a template:
class FixedWidth < Parslet::Atoms::Base
attr_reader :bound_parslet
attr_reader :width
def initialize(width, bound_parslet) # :nodoc:
super()
@width = width
@bound_parslet = bound_parslet
@error_msgs = {
:premature => "Premature end of input (expected #{width} characters)",
:failed => "Failed fixed width",
}
end
def try(source, context) # :nodoc:
pos = source.pos
teststring = source.read(width).to_s
if (not teststring) || teststring.size != width
return error(source, @error_msgs[:premature]) #if not teststring && teststring.size == width
end
fakesource = Parslet::Source.new(teststring)
value = bound_parslet.apply(fakesource, context)
return value if not value.error?
source.pos = pos
return error(source, @error_msgs[:failed])
end
def to_s_inner(prec) # :nodoc:
"FIXED-WIDTH(#{width}, #{bound_parslet.to_s(prec)})"
end
def error_tree # :nodoc:
Parslet::ErrorTree.new(self, bound_parslet.error_tree)
end
end
# now we can easily define a fixed-width currency rule:
class SHPParser
rule(:currency15) {
FixedWidth.new(15, currency >> str(' ').repeat)
}
end
Of course, this is a pretty hacked solution. Among other things, line numbers and error messages are not good inside of a fixed width constraint. I would love to see this idea implemented in a better fashion.