3

i am new here working with boost spirit

Reading alot of very good articels for boost Spirit i decide to make an own Parser and run into the Problem that parsing an Expression like this

1+(2+(3+(4+(5+(6+(7+(8)))))))

takes forever on runtime.. making it more simple 1+(2+(3)) works fine. I Looks like the backtracking of the Parser is active. Please give me a hint how to modify the grammer or behaviour to make this run in time.

Here is a bit code from the grammer. I use the "iter_pos" for tracking the Position.

regards Markus

  primary            = functioncall 
                     | constant_double   
                     | constant_integer 
                     | name 
                     | string;

  constant_double   = real_parser< double, strict_ureal_policies<double> >()                    [_val = construct<common_node>(type_const_double, key_value, _1)];
  name              = name_pure_location                                                        [_val = construct<common_node>(type_name, phoenix::bind(&getLocation, _1),key_value, phoenix::bind(&getString, _1))];
  string            = (lexeme[L'"' >> +(boost::spirit::standard_wide::char_ - L'"') >> L'"'])   [_val = construct<common_node>(type_const_string, key_value,phoenix::bind(&makeString, _1))];
  constant_integer  = int_                                                                      [_val = construct<common_node>(type_const_int, key_value, construct<int>(_1))];

  parenthetical =
              lit('(') >> expression >> lit(')')
              | primary;

  unary =
     (iter_pos >> unary_op >> unary >> iter_pos)                            [_val = construct<common_node>(
                                                                                 type_cmd_member_call,
                                                                                 LOCATION(_1,_4),
                                                                                 key_callname, construct<std::wstring>(_2),
                                                                                 key_this,construct<common_node>(_3)
                                                                            )]
    | parenthetical[_val = _1]
    ;

  multiplicative =

    (iter_pos >> unary >> (multiply_op | divide_op | modulo_op) >> multiplicative >> iter_pos)    [_val = construct<common_node>(
                                                                            type_cmd_member_call,
                                                                            LOCATION(_1, _5),
                                                                            key_callname, construct<std::wstring>(_3),
                                                                            key_this, construct<common_node>(_2),
                                                                            key_parameter, construct<common_node>(_4)
                                                                            )]
    | unary[_val = _1];

  additive =

    (iter_pos >> multiplicative >> (add_op | subtract_op) >> additive >> iter_pos) [_val = construct<common_node>(
                                                                                type_cmd_member_call,
                                                                                LOCATION(_1, _5),
                                                                                key_callname, construct<std::wstring>(_3),
                                                                                key_this, construct<common_node>(_2),
                                                                                key_parameter, construct<common_node>(_4)
                                                                              )]
    | multiplicative[_val = _1]
    ;

  compares =

    (iter_pos >> additive >> (compare_op) >> compares >> iter_pos)      [_val = construct<common_node>(
                                                                            type_cmd_member_call,
                                                                            LOCATION(_1, _5),
                                                                            key_callname, construct<std::wstring>(_3),
                                                                            key_this, construct<common_node>(_2),
                                                                            key_parameter, construct<common_node>(_4)
                                                                          )]
    | additive[_val = _1]
    ;

  expression = compares[_val = _1];
ildjarn
  • 62,044
  • 9
  • 127
  • 211
Markus
  • 373
  • 1
  • 11

2 Answers2

2

You correctly identified the source of the trouble: the rules are specified "lazily" (in that they - as they should - descriptively give the productions for the rule).

As you can see, in a PEG grammar this quickly leads to a bad performance if there's a lot of backtracking.

I've already shown optimization of very similar expressions. The summary is this: instead of "expecting" a binary expression and backtracking if it turns out not to be satisfied, parse the first operand "greedily" and compose different AST expression nodes depending on what follows.

Perhaps interesting:

Community
  • 1
  • 1
sehe
  • 374,641
  • 47
  • 450
  • 633
0

thx for this Input. I try your sample and it works like expected (and fast) .. anyway i was not able to use the solution from "Boost::Spirit : Optimizing an expression Parser" with my sematic actions. Simply don´t know how to place my actions into this sample .. All the other links are a greate source and i also understand the ideas behind the scene but it seems that this is still a workaround of the more general problem of PEG grammer and backtracking. So is there a more General way to solve this? E.g. using the ">" Operator instead of ">>" to avoid backtracking at some point. So far i failed in using the ">" Operator in my sample.

After playing around with the other info on the links i rewrite the rules to this one and now it works fine and fast.

primary = functioncall | constant_double | constant_integer | name | string;

constant_double = real_parser<double, strict_ureal_policies<double>>()
    [_val = construct<common_node>(type_const_double, key_value, _1)];

name = name_pure_location[_val = construct<common_node>(
                              type_name, phoenix::bind(&getLocation, _1),
                              key_value, phoenix::bind(&getString, _1))];

string = (lexeme[L'"' >> +(boost::spirit::standard_wide::char_ - L'"') >> L'"'])
    [_val = construct<common_node>(type_const_string, key_value,
                                   phoenix::bind(&makeString, _1))];

constant_integer = int_[_val = construct<common_node>(type_const_int, key_value,
                                                      construct<int>(_1))];

parenthetical = lit('(') >> expression >> lit(')') | primary;

// this define is used to make to code more readable ..
#define EXPR_ACTION(NAME)                                                      \
  [_val = construct<common_node>(type_cmd_member_call, LOCATION(_1, _3),       \
                                 key_callname, construct<std::wstring>(NAME),  \
                                 key_this, construct<common_node>(_val),       \
                                 key_parameter, construct<common_node>(_2))]

unary = (iter_pos >> unary_op >> unary >>
         iter_pos)[_val = construct<common_node>(
                       type_cmd_member_call, LOCATION(_1, _4), key_callname,
                       construct<std::wstring>(_2), key_this,
                       construct<common_node>(_3))] 
         | parenthetical[_val = _1];

multiplicative = unary[_val = _1] >> 
           *(iter_pos >> lit('*') >> unary >> iter_pos) EXPR_ACTION(L"_mul") >>
           *(iter_pos >> lit('/') >> unary >> iter_pos) EXPR_ACTION(L"_div");

additive = multiplicative[_val = _1] >>
           *(iter_pos >> lit('-') >> multiplicative >> iter_pos) EXPR_ACTION(L"_sub") >>
           *(iter_pos >> lit('+') >> multiplicative >> iter_pos) EXPR_ACTION(L"_add");

compares =
    additive[_val = _1] >>
    *(iter_pos >> lit('<') >> additive >> iter_pos) EXPR_ACTION(L"_cmp_low") >>
    *(iter_pos >> lit('>') >> additive >> iter_pos) EXPR_ACTION(L"_cmp_gre") >>
    *(iter_pos >> lit('=') >> additive >> iter_pos) EXPR_ACTION(L"_cmp_equ");

logical = compares[_val = _1] >> 
    *(iter_pos >> nocaselit(L"and") >> compares >> iter_pos) EXPR_ACTION(L"_and") >>
    *(iter_pos >> nocaselit(L"or")  >> compares >> iter_pos) EXPR_ACTION(L"_or");

expression = logical[_val = _1];

thank you very much for the good Inputs ! Markus

sehe
  • 374,641
  • 47
  • 450
  • 633
Markus
  • 373
  • 1
  • 11