
Given the following Parslet parser:

require 'parslet'
require 'parslet/convenience'

class Lines < Parslet::Parser
    rule(:open_tag) {str('[')}
    rule(:close_tag) {str(']')}
    rule(:data) {str('name') | str('name_id') }
    rule(:text) { open_tag >> data >> close_tag }
    root :text
end

begin
    p Lines.new.parse("[name_id]")    # <-- raises Parslet::ParseFailed
rescue Parslet::ParseFailed => failure
    Lines.new.parse_with_debug("[name_id]")
end

It gives the following error:

Failed to match sequence (OPEN_TAG NAME CLOSE_TAG) at line 1 char 6.
`- Expected "]", but got "_" at line 1 char 6.

If I change the data rule from

rule(:data) {str('name') | str('name_id') }

to

rule(:data) {str('name_id') | str('name') }

then it works as expected.

But I am generating the rules dynamically based on user input, so this solution won't work for me.

Thanks in advance.

Darshan Patel
  • I think I need more information on what your overall goal is. I can foresee many problems with generating a parser based on user input. – Nigel Thorne May 10 '17 at 23:54

2 Answers


The alternatives in rule :data are tried in the order they were given, and the first one that matches wins. To make sure longer matchers are tried before the shorter ones that are their prefixes, one might simply sort them:

data = %w|name name_id|

data = data.sort { |a, b| b <=> a }

rule(:data) { data.map(&method(:str)).reduce(:|) }
Aleksei Matiushkin
  • I used this alternative, but I am interested in solving this with the Parslet API so that it matches the whole word. – Darshan Patel May 11 '17 at 04:18
  • Parslet consumes input from a stream. If a token is fully matched, that part of the stream is consumed. If "name" matches fully, the grammar then tries to match the rest of the input against the rest of the grammar. So you need to find a way to say that "name" shouldn't match. One way is to put name_id first so that it matches preferentially. The other is to know what token follows name and explicitly match that. – Nigel Thorne May 11 '17 at 05:00

As mudasobwa says, name matches first, so the parser never gets a chance to try name_id. You either need to change the order so that name_id is tried first, or you need to make name fail to match. How you do this depends on your grammar.

require 'parslet'
require 'parslet/convenience'

class Lines < Parslet::Parser
    rule(:open_tag) {str('[')}
    rule(:close_tag) {str(']')}
    # A matcher must not succeed unless it really is a match. Including the
    # closing bracket works here because 'name]' fails on "name_id]".
    rule(:data) { str('name]') | str('name_id]') }

    rule(:text) { open_tag >> data  }
    root :text
end

begin
    p Lines.new.parse("[name_id]")   
rescue Parslet::ParseFailed => failure
    Lines.new.parse_with_debug("[name_id]")
end

I think I would instead let the parser break the text up for me and then inspect the structure afterwards, e.g.:

require 'parslet'
require 'parslet/convenience'

class Lines < Parslet::Parser
    rule(:open_tag) {str('[')}
    rule(:close_tag) {str(']')}
    rule(:data) { (close_tag.absnt? >> any).repeat(1).as(:data)  }
    rule(:text) { open_tag >> data >> close_tag }
    root :text
end

begin
    p Lines.new.parse("[name_id]")   # =>  {:data=>"name_id"@1}
rescue Parslet::ParseFailed => failure
    Lines.new.parse_with_debug("[name_id]")
end

Parslet is intended to work in two phases. The first converts your document into a tree. The second converts that tree into the data representation you want.

In this case the first pass pulls out the structure. The second pass could then check that "name_id" is valid, and so on.

Nigel Thorne