
The question is in bold at the bottom; the problem is also summarized by the distilled code fragment towards the end.

I am trying to unify my type system (the type system converts types to and from strings) into a single component (as defined by Lakos). I am using boost::array, boost::variant, and boost::mpl to achieve this. I want to have the parser and generator rules for my types unified in a variant. There is an undefined type, an int4 type (see below), and an int8 type. The variant reads as variant<undefined, int4, int8>.

int4 traits:

    struct rbl_int4_parser_rule_definition
    {
      typedef boost::spirit::qi::rule<std::string::iterator, rbl_int4()> rule_type;
      
      boost::spirit::qi::int_parser<rbl_int4> parser_int32_t;
      
      rule_type rule;
      
      rbl_int4_parser_rule_definition()
      {
        rule.name("rbl int4 rule");
        rule = parser_int32_t;  
      }
    };
    
    template<>
    struct rbl_type_parser_rule<rbl_int4>
    {
      typedef rbl_int4_parser_rule_definition string_parser;
    };

The variant above starts out as undefined, and then I initialize the rules. This caused 50 pages of errors, and I finally managed to track the problem down: variant uses operator= during assignment, and a boost::spirit::qi::int_parser<> cannot be assigned to another (operator=).

To contrast, I don't have a problem with my undefined type:

    struct rbl_undefined_parser_rule_definition
    {
      typedef boost::spirit::qi::rule<std::string::iterator, void()> rule_type;
      rule_type rule;
      
      rbl_undefined_parser_rule_definition()
      {
        rule.name("undefined parse rule");
        rule = boost::spirit::qi::eps;
      }
    };
    
    template<>
    struct rbl_type_parser_rule<rbl_undefined>
    {
      typedef rbl_undefined_parser_rule_definition string_parser;
    };

Distillation of the problem:

    #include <string>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/variant.hpp>
    #include <boost/cstdint.hpp>
    
    typedef boost::spirit::qi::rule<std::string::iterator,void()> r1;
    typedef boost::spirit::qi::rule<std::string::iterator,int()> r2;
    
    typedef boost::variant<r1,r2> v;
    
    int main()
    {
      /*
      problematic
      boost::spirit::qi::int_parser<int32_t> t2;
      boost::spirit::qi::int_parser<int32_t> t1;
      
      
      t1 = t2;
      */
    
      //unproblematic
      r1 r1_;
      r2 r2_;
      r1_ = r2_;
    
      v v_;
      // THIS is what I need to do.
      v_ = r2();
    }

There is a semantic gap between concrete parsers and rules. My brain is smoking at the moment, so I am not going to think about pragmatism. My question is: how do I solve this problem? I can think of three approaches.

one: Static function members:

    struct rbl_int4_parser_rule_definition
    {
      typedef boost::spirit::qi::rule<std::string::iterator, rbl_int4()> rule_type;
      
      //boost::spirit::qi::int_parser<rbl_int4> parser_int32_t;
      
      rule_type rule;
      
      rbl_int4_parser_rule_definition()
      {
        static boost::spirit::qi::int_parser<rbl_int4> parser_int32_t;
    
        rule.name("rbl int4 rule");
        rule = parser_int32_t;  
      }
    };

I guess approach one prevents thread-safe code?

two: The integral parser is wrapped in a shared_ptr. There are two reasons I'm bothering with TMP for the typing system: (1) efficiency, and (2) centralizing concerns into components. Using pointers defeats the first reason.
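
A minimal sketch of approach two, written without Spirit so it stands alone. `fixed_parser` and `shared_definition` are hypothetical names, and `fixed_parser` is only a stand-in for a parser type that cannot be copy-assigned; it is not a Spirit type. Holding the parser through a `std::shared_ptr` makes the enclosing definition assignable, at the cost of a heap allocation and an extra indirection:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a parser that cannot be copy-assigned.
struct fixed_parser
{
  explicit fixed_parser(int r) : radix(r) {}
  fixed_parser(const fixed_parser&) = default;
  fixed_parser& operator=(const fixed_parser&) = delete;
  int radix;
};

// Approach two: the definition holds the parser through a shared_ptr, so
// assigning the definition rebinds the pointer and never assigns the parser.
struct shared_definition
{
  std::shared_ptr<fixed_parser> parser;
  shared_definition() : parser(std::make_shared<fixed_parser>(10)) {}
};

static_assert(!std::is_copy_assignable<fixed_parser>::value,
              "the parser stand-in is not assignable");
static_assert(std::is_copy_assignable<shared_definition>::value,
              "the definition that wraps it is assignable");

// Returns true if, after assignment, both definitions share one parser.
bool shares_parser_after_assignment()
{
  shared_definition a;
  shared_definition b;
  a = b;  // a's original parser is released; a and b now share b's parser
  return a.parser == b.parser && a.parser.use_count() == 2;
}
```

Whether the indirection matters in practice depends on how often the parser is reached through the pointer, which this sketch does not measure.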

three: operator= is defined as a no-op. Variant guarantees that the lhs is default constructed before assignment.

Edit: I am thinking option 3 makes the most sense (operator= is a no-op). Once the rule container is created it will not change, and I am only assigning to force a type's rule trait into its offset.
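
A minimal sketch of approach three, again without Spirit. `opaque_parser` and `noop_assign_definition` are hypothetical names; `opaque_parser` only stands in for a parser type that cannot be copy-assigned. The sketch also shows the main caveat: after a no-op assignment the left-hand side keeps its own state, so this is only sound under the assumption that all instances of the definition are interchangeable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a parser that cannot be copy-assigned.
struct opaque_parser
{
  explicit opaque_parser(int r) : radix(r) {}
  opaque_parser(const opaque_parser&) = default;
  opaque_parser& operator=(const opaque_parser&) = delete;
  int radix;
};

struct noop_assign_definition
{
  opaque_parser parser;

  noop_assign_definition() : parser(10) {}
  explicit noop_assign_definition(int r) : parser(r) {}
  noop_assign_definition(const noop_assign_definition&) = default;

  // Approach three: assignment deliberately does nothing.
  noop_assign_definition& operator=(const noop_assign_definition&)
  {
    return *this;
  }
};

static_assert(std::is_copy_assignable<noop_assign_definition>::value,
              "the definition is assignable even though its member is not");

// Returns the radix of the left-hand side after assigning a different
// definition to it; the no-op leaves the original value in place.
int radix_after_assignment()
{
  noop_assign_definition a(10);
  noop_assign_definition b(16);
  a = b;  // compiles, but a still holds its own parser
  return a.parser.radix;
}
```

Here `radix_after_assignment()` returns 10, not 16, which is harmless when every instance is built the same way and surprising otherwise.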

Waqar
Hassan Syed
  • option 1 is thread unsafe only if: `parser_int32_t` has state *and* a reference is taken. If it is stateless or a copy is made, then it is safe. From the semantics, I would say a copy is made. – Matthieu M. May 19 '11 at 15:33
  • It is quite a confusing concern, I cannot be certain that the parser object does not have state. Also, there are reference and concrete semantics with the rule mechanics -i.e., a rule can hold references to other rules, but they can also be concrete parsers themselves (I think), and I don't know how these semantics apply to concrete parsers. – Hassan Syed May 19 '11 at 15:45
  • @MatthieuM : Right, a copy is made unless `.alias()` is used. – ildjarn May 19 '11 at 17:24
  • @ildjarn but a rule is not a concrete parser :D the contents of a rule are an expression, the equivalent of a parse tree. – Hassan Syed May 19 '11 at 18:00
  • @Hassan: what version of spirit/compiler are you using? It seems to at least compile for me (the part in comments, t1 = t2;) with msvc10 and spirit from boost 1.45. – n1ckp Jul 06 '11 at 13:10
  • I can't evaluate whether #1 would be thread-safe or not, but I can give an ounce of advice that's easy to forget. A static initialization is only ever evaluated once. Imagine a little check in the code (if (!evaluated_yet) evaluate() else noop()). The first time any rbl_int4_parser_rule_definition's relevant member object is called anywhere, it will be constructed that one time. *It is almost absolutely equivalent to using a global singleton.* Could you use a global singleton of that type to solve the same problem (ignoring init. order etc.)? If so, this should be thread-safe. – std''OrgnlDave Jan 02 '12 at 03:20
  • How about picking a better tool for the job? – zvrba Jan 27 '12 at 15:11
  • @zvrba I suppose I could write my own recursive descent parser, domain specific (which I will probably end up doing) :D I asked this question a long while ago. Spirit is quite an interesting framework, if C++ TMP was easier to work with, I would recommend it to anyone doing any parsing. – Hassan Syed Jan 27 '12 at 17:13
  • I've just tried to compile your "distillation" code, and it compiled without error (boost 1.39, gcc 4.3.2) – celtschk Jan 28 '12 at 13:57

1 Answer


I'm not so sure I get the full extent of the question, but here are a few hints:

  • The line commented with // THIS is what I need to do. compiles fine with me (problem solved? I'm guessing you actually meant assigning a parser, not a rule?)

  • Initialization of function-local static has been defined to be thread safe in the latest standard (C++11). Check your compiler support for C++0x threading. (If the initializer throws, a pass of the initialization statement will try to initialize again, by the way).

  • rules alias()

    As described in http://boost-spirit.com/home/articles/doc-addendum/faq/#aliases

    You can create 'logical copies' of rules without having to actually value-copy the proto expression. As the FAQ says, this is mainly to allow lazy binding.

  • The Nabialek Trick might be precisely what you need; basically, it lazily selects a parser for subsequent parsing

    one = id;
    two = id >> ',' >> id;
    
    keyword.add
        ("one", &one)
        ("two", &two)
        ;
    
    start = *(keyword[_a = _1] >> lazy(*_a));
    

    In your context, I could see keyword defined as

    qi::symbols<char, qi::rule<Iterator>*> keyword;
    

    doing all the work with attributes from semantic actions. Alternatively,

    qi::symbols<char, qi::rule<Iterator, boost::variant<std::string,int>() >*> keyword;
    
  • Bring the rules under the same type (like shown in the previous line, basically)

    This is the part where I'm getting confused: you say you want to unify your type system. There might not be a need for strongly typed parsers (distinct attribute signatures).

    typedef boost::variant<std::string,int> unified_type;
    typedef qi::rule<std::string::iterator, unified_type() > unified_rule;
    
    unified_rule rstring = +(qi::char_ - '.');
    unified_rule rint    = qi::int_;
    
    unified_rule combine = rstring | rint;
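
The point above about function-local statics can be sketched without Spirit. `int_parser_stub`, `make_stub`, and `shared_stub` are hypothetical names, and the stub only stands in for a parser object. Since C++11, the initialization of `instance` runs at most once to completion, concurrent callers wait for it, and an initializer that throws is attempted again on the next call. The sketch exercises only the retry behavior, single-threaded:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a parser object; construction can be made to fail.
struct int_parser_stub
{
  explicit int_parser_stub(bool fail)
  {
    if (fail) throw std::runtime_error("construction failed");
  }
};

// Counts how many times the initializer of the static below has run.
int& init_attempts()
{
  static int n = 0;
  return n;
}

// Fails on the first attempt only, to make the retry observable.
int_parser_stub make_stub()
{
  return int_parser_stub(++init_attempts() == 1);
}

const int_parser_stub& shared_stub()
{
  // C++11: initialized once; if make_stub() throws, the static stays
  // uninitialized and the next call runs the initializer again.
  static const int_parser_stub instance = make_stub();
  return instance;
}

// Returns how many times the initializer ran before the static was ready.
int attempts_until_initialized()
{
  try { shared_stub(); } catch (const std::runtime_error&) {}  // 1st: throws
  shared_stub();  // 2nd: initializer runs again and succeeds
  shared_stub();  // 3rd: already initialized, initializer does not run
  return init_attempts();
}
```

Note that this guarantee covers only the initialization; any later mutation of a shared static object still needs its own synchronization.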
    
sehe