6

I am creating a formal spec for a very simple rule language, very simple. I want to use EBNF as this is a standard but I can't figure out how to specify order of operations. Here is the specification so far.

rule = statement, { (‘AND’|’OR’), statement};

variable = ‘$’,alphabetic character, {alphabetic character | digit};

statement = variable, [ ‘count’,[white space ],’>’,[white space],number ];

alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G"
                     | "H" | "I" | "J" | "K" | "L" | "M" | "N"
                     | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
                     | "V" | "W" | "X" | "Y" | "Z" ;

number = [ "-" ] , digit , { digit } ;

digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;

white space = ? white space characters ? ; 

The question I have is how do I show that things in brackets should be evaluated first. So something like this

$strap AND ($greenSticker count > 5 OR ($greenSticker AND $redSticker))

It seems like a common feature to most languages, but my Google skills are failing me and I can't seem to find an example.

Dan Walmsley
  • 2,765
  • 7
  • 26
  • 45

1 Answers1

12

Given this as a simplified example LL grammar:

expression -> (+|-|ε) term ((+|-) term)*
term -> factor ((*|/) factor)*
factor -> var | number | (expression)

As you can see, the operators with lower precedence (+ and -) are in a more general rule than the higher precedence operators (* and /). It is all about producing the correct parse tree. But as a rule of thumb, "outer" or more general rules have less precedence, which is why the addition and subtraction operators are placed beside term, because term must be further derived. If you look at more complicated grammars you will see that this is taken into an extreme to have proper precedence.

Austin Henley
  • 4,625
  • 13
  • 45
  • 80
  • 1
    When expressing precedence in grammar rules, I can't see why you are making a distinction between top-down and bottom-up parsing. As you said, it's all about producing the correct parse tree, and you just construct that top-down or bottom-up, to get to the same result eventually. Of course this has a few implications, but it is not obvious to me how you are relating the parsing technique to (this kind of) precedence. Please explain. – Gunther Mar 30 '12 at 08:42
  • 1
    That expression grammar will work in a bottom-up parser too, and in a top-down LL parser as well, not just in recursive-descent parsers. A Pratt parser is recursive-descent and top-down too. Your second paragraph really doesn't make any sense to me. – user207421 Apr 05 '12 at 01:26
  • To address both of your issues, I have just removed the confusing paragraph. I hope the rest of my answer is beneficial! – Austin Henley Apr 05 '12 at 02:31
  • this makes sense but i'm still a little confused. when a parser looks at `3 + 4 * 5`, how does it know not to evaluate `3 + 4` as "expression" first, then evaluate that `expression` as "factor" next, and then evaluate `factor * 5` as "term" last? Does it have anything to do with the *order* of your statements? – chharvey Apr 16 '19 at 22:17
  • @chharvey Yes, it has to do with the order. The rules further down the grammar (i.e., closer to the terminals, like `number`) have a higher precedence. So * and / have higher precedence than + and -. Take your example and try manually parsing it using my grammar. – Austin Henley Apr 17 '19 at 00:21
  • @AustinHenley I did: it still works even though it shouldn't: `3 -> number -> factor -> term[1] ; 4 -> number -> factor -> term[2] ; term[1] + term[2] -> expression -> factor[3] ; 5 -> number -> factor[4] ; factor[3] * factor[4] -> term ;` Am I missing something? – chharvey Apr 17 '19 at 06:24