3

I am working on a toy parser in golang just to learn the language. I added a test case with grammar covering the following cases:

Valid:
a, ab, aba, ababababababa, ababababab
b, ba, bab, babababababab, bababababa

Invalid:
abb, baa

a is always followed by b and vice versa.

Now the grammar in my parser looks like that (I omit the surrounding code for sake of brevity):

    "expr": Or(Ref("A"), Ref("B")),
    "A": And(
        a,
        Optional(
            And(
                b,
                Optional(Ref("A"))))),
    "B": And(
        b,
        Optional(Ref("A")))

Where

a - exact match for "a" character
b - exact match for "b" character
"A", "B", "expr" - names of the parts of the grammar that can be referred
                   later with Ref("A")
And      - consume sequence of expressions
Or       - consume any of the expressions
Ref      - refer to other expression (allows recursion)
Optional - make the expression non-obligatory

I guess it is not the most succinct way to describe this grammar. How to make it more compact?

Related:


EDIT:

The BNF answer from Filip can be written in my syntax as:

    "expr": Or(Ref("A"), Ref("B")),
    "A":    Or(And(a, Ref("B")), a),
    "B":    Or(And(b, Ref("A")), b)
Community
  • 1
  • 1
Jakub M.
  • 32,471
  • 48
  • 110
  • 179
  • @DenysSéguret: not sure. It is not really about golang but more about parsers and grammar. – Jakub M. Jan 22 '16 at 19:36
  • 2
    @DenysSéguret If it's on-topic here, let it stick here. No point in moving something if it's on-topic here as well. – Mast Jan 22 '16 at 19:37
  • 1
    @DenysSéguret There is no code in this question to review. – 200_success Jan 22 '16 at 19:40
  • As a regular expression, that language is `a?(ba)*b?` but I don't think that helps you parse it with an RDP. (Yes, it is also `b?(ab)*a?` Curiously, the two regular expressions are equivalent.) – rici Jan 22 '16 at 22:02

1 Answers1

3

The BNF grammar you have is this:

expr ::= A | B
A ::= "a" B | "a"
B ::= "b" A | "b"

which I think translates to this using your syntax:

"expr": Or(Ref("A"), Ref("B")),
"A": And(
    a,
    Optional(Ref("B"))),
"B": And(
    b,
    Optional(Ref("A")))

Note that it is important to check terminals ("a", "b") before non-terminals (Ref(x)), or you'll get an infinite loop. It would always try to see if it could match another A or B to the end of the string, and then another, and another, causing a never ending recursion.

Filip Haglund
  • 13,919
  • 13
  • 64
  • 113