4

I'm exceptionally new to jison and have managed to piece together a helpful query parser. I'm now trying to create a parser that can parse a string like "a == 1 and b == 1 and c == 1" into an object like

{and: [
  {a: {eq: 1}},
  {b: {eq: 1}},
  {c: {eq: 2}}
]}

while a string like "a == 1 or b == 1 and c == 1" should parse into an object like

{or: [
  {a: {eq: 1}},
  {and: [
    {b: {eq: 1}},
    {c: {eq: 1}}
  ]}
]}

My grammar looks like this so far:

%lex

%%
\s+       /*skip whitespace*/
\"(\\.|[^"])*\"          yytext = yytext.substr(1, yyleng-2); return 'STRING';
"=="                     return '==';
and[^\w]                 return 'and';
or[^\w]                  return 'or';
[0-9]+(?:\.[0-9]+)?\b    return 'NUMBER';
[a-zA-Z][\.a-zA-Z0-9_]*  return 'SYMBOL';
<<EOF>>                  return 'EOF';

/lex

%left 'or'
%left 'and'
%left '=='

%start expressions

%%

expressions
  : e EOF
    {$$ = $1;}
  ;

e
  : property '==' value
    { $$ = {}; $[$1] = {eq: $3}; }
  | boolAnd 
    { $$ = {and: $1}}
  | boolOr 
    { $$ = {or: $1}}
  ;

boolAnd
  : boolAnd 'and' e
    {$$ = $1; $1.push($3);}
  | e 'and' e
    {$$ = [$1, $3];}
  ;

boolOr
  : boolOr 'or' e
    {$$ = $1; $1.push($3);}
  | e 'or' e
    {$$ = [$1, $3];}
  ;

property
  : SYMBOL
    {$$ = $1;}
  ;

value
  : NUMBER
    {$$ = Number(yytext);}
  | STRING
    {$$ = yytext; }
  ;

and it gives me the following conflict errors:

Conflict in grammar: multiple actions possible when lookahead token is and in state 4
- reduce by rule: e -> boolAnd
- shift token (then go to state 11)
Conflict in grammar: multiple actions possible when lookahead token is or in state 5
- reduce by rule: e -> boolOr
- shift token (then go to state 12)

States with conflicts:
State 4
  e -> boolAnd . #lookaheads= EOF and or
  boolAnd -> boolAnd .and e #lookaheads= EOF and or
State 5
  e -> boolOr . #lookaheads= EOF and or
  boolOr -> boolOr .or e #lookaheads= EOF or and

Is anyone able to offer a suggestion on what I am doing wrong? Many thanks

jonotron
  • 83
  • 7

1 Answers1

1

Since

e : boolAnd

It is impossible to decide between:

boolAnd: e 'and' e
       | boolAnd 'and' e

which is what jison is complaining about. (And it's worth noting that the reduction of boolAnd to e does not seem to be what you want. It's really a type error, or would be if JS had types.)

Personally, I'd just use binary trees; in my experience, they turn out to be easier to work with. You can do that easily with a single non-terminal and precedence declarations.

%left 'or'
%left 'and'
%%
e
  : property '==' value
    { $$ = {eq: [$1, $3]}; /* This seems better to me. YMMV. }
  | e 'and' e
    { $$ = {and: [$1, $3]}}
  | e 'or' e
    { $$ = {or: [$1, $3]}}
  | '(' e ')'
    { $$ = $2  /* You didn't have this one, but it seems useful */ }
  ;

It is possible to make a grammar which will handle variadic operators (i.e.. reduce x OP y OP z to {OP: [x, y, z]}) but it's actually quite a bit of work to get it right, and it does not yield easily to a solution based on precedence declarations. Unless you really want to distinguish between x OP y OP z and x OP (y OP z), which is unnecessary in the case of boolean operators, it is usually easier and more general to collapse multiple similar binary operators in a second pass over the parse tree, or directly when creating the binary node, by examining the operator types of the child expressions.

rici
  • 234,347
  • 28
  • 237
  • 341
  • This was helpful. I had not considered doing a second pass over the tree to collapse same operators. I ended up writing a recursion that flattened my binary boolean operator arrays and it works nicely. Though I would be curious what the flex would be for such a thing given your caveats. Much thanks! – jonotron May 10 '16 at 18:30