0

Bison newbie here.

I'm trying to add a feature to a my toy calculator, which is the possibility to omit the multiplication operator * before any variable, so that it can parse stuff like: 3x * 2y.

I've reduced the program down to a very simple parser, which only supports multiplication (both explicit and omitted), where numbers are represented by the character n and variables are represented by the character v.

Here's an input example, similar to the one above, that should be accepted:

nv * nv

More complicated examples should also work (please ignore whitespace):

nvv * n * nvvv

Here is the code file calc.y:

%start  expression
%left '*'

%%
expression: 
    'n'
|   'v'
|   expression 'v' %prec '*'
|   expression '*' expression
;
%%

"Compiled" with:

bison calc.y

and I get the following warning:

calc.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]

From my limited understanding of this, it seems the %prec '*' thingy isn't doing anything. How do I tell Bison to match either rule in the order they appear? Thanks!

SlySherZ
  • 1,631
  • 1
  • 16
  • 25
  • You're aware that your grammar doesn't allow `3*x`, right? So when you get around to implementing addition, you'll have to write `1x+1y` for `x+y` – rici Sep 21 '18 at 22:49
  • I missed it, nice catch! Already edited the answer to include this case, the warning is the same. Thanks : ) – SlySherZ Sep 22 '18 at 00:34
  • Fortunately, your edit doesn't want invalidate my answer. Otherwise I'd be grumpy. – rici Sep 22 '18 at 00:47

2 Answers2

1

You are almost certainly better off not using precedence to disambiguate this construction. It can be made to work, but it's awkward and harder to understand than the alternative (at least, in my opinion.) Here are a couple of SO answers which explore writing grammars with explicit precedence order for expressions including an "invisible" operator: How can I have an “implicit multiplication” rule with Bison? and Is it possible to make this YACC grammar unambiguous? expr: … | expr expr.

The yacc/bison precedence algorithm is described in the bison manual and more briefly in a number of SO answers, several of which quote the following paragraph (originally from here.)

Recall that a precedence relation is defined between a production and a terminal. It does not relate two terminals nor two productions (and so cannot be used to resolve reduce-reduce conflicts). The comparison between precedence of the production which could be reduced and the lookahead terminal determines whether a reduce or a shift will occur. For notational convenience, productions are represented by the name of a terminal, usually the only terminal in the production; this corresponds to a common use case but it is sometimes confusing. In particular, the %prec declaration only serves to give a rule a name for use in precedence declarations, and it is probably better to think about it in that way rather than as an "explicit" declaration.

So your precedence declaration is doing nothing to resolve the ambiguity in

expression 'v' %prec '*'

because the relevant lookahead token is 'v', and that doesn't have any declared order in the precedence table.

Moreover, you might not really want adjacent multiply to have the same precedence as explicit multiply, because 2*3v does not seem to be (2*3)v (which is what 2*3*v is parsed as), but rather 2*(3v). Not everyone would agree with that -- and it is part of the reason that adjacent multiply is problematic -- and there are people who would say that 2/3v means (2/3)*v. We can agree to differ on that, but it's something to watch out for.

Anyway, if you really want to do this with precedence relations, you need to include in your precedence order all the tokens which could possible be at the beginning of an expression. (They should all be in a single precedence level at the end of the list.)

There's a list of SO answers about adjacency as an operator at the end of this answer; some of them include examples of precedence ordering as a solution. In your case, you have two options, depending on how you believe explicit and implicit multiplication should interact:

%left '*'
%left 'v' /* And any other token which could start an expression */

or

%left '*' 'v' /* Again, add other tokens which could start an expression */

Since in both cases 'v' is in the precedence order, there is no need to modify the default precedence of the expression 'v' production; the default precedence (which is v because that's the last terminal in the production) will work fine:

expression: 'n' | expression 'v' | expression '*' expression

However, if you someday decide that 2(3+v) is a valid implicit multiply, then you would need to change that to:

expression: 'n' | '(' expression ')' | expression '*' expression
          | expression expression %prec 'v'

Now a %prec declaration is necessary because expression expression has not default precedence.

You could add it to the precedence order, in which case you could remove the %prec declaration from the production:

rici
  • 234,347
  • 28
  • 237
  • 341
  • By carefully reading this, I realized precedence is not the best way to solve the problem, so I added a complimentary answer that solves the problem with a different grammar. Your answer helped me understand a bunch of things, thanks! :) – SlySherZ Sep 22 '18 at 01:44
0

If I understood @rici's answer correctly, it is not a good idea to use precedence to solve this, because we would have to set the precedence for the token which comes after 'expression', which in this case is just 'v', but there could be many of them (and we would have to set it for all of them).

The alternative is to restrict the things that can appear before the 'v' so that there is no ambiguity to start with, like so (notice how the thing that comes before 'v' cannot include explicit multiplications):

%start  expression
%left '*'

%%
term:
    'n'
|   'v'
|   term 'v'
;

expression: 
    term
|   expression '*' expression
;
%%
SlySherZ
  • 1,631
  • 1
  • 16
  • 25