Introduction

I am trying to combine two non-terminal rules that are not defined in the same translation unit. A minimal example reproducing the issue is provided below, and is also available live on Coliru.

TEST0
Re-using a rule directly (without embedding it into another rule) works fine, even though it is defined in another translation unit. This is the well-known X3 program structure example from the X3 documentation. This is configuration TEST0 in the live test below.

TEST1
I initially avoided the BOOST_SPIRIT_DEFINE/DECLARE/INSTANTIATE() macros for one of the non-terminal rules with:

auto const parser2 
    = x3::rule<class u2,uint64_t>{"parser2"} 
    = "Trace Address: " >> parser1();

which resulted in an unresolved external symbol linker error. Surprisingly, the missing symbol belongs to parser1 (and not parser2), for which the BOOST_XXX macros are used (see unit1.cpp). This is configuration TEST1.

TEST2
I then moved to configuration TEST2, where the BOOST_XXX macros are used for both rules. This solution compiles and runs with Visual Studio 2019 (v16.8.3) but produces a core dump with g++ (as can be seen in the test below).

Minimal example reproducing the issue

unit1.h

#ifndef UNIT1_H
#define UNIT1_H
#include <cstdint>
#include "boost/spirit/home/x3.hpp"
#include "boost/spirit/include/support_istream_iterator.hpp"

namespace x3 = boost::spirit::x3;
using iter_t = boost::spirit::istream_iterator;
using context_t = x3::phrase_parse_context<x3::ascii::space_type>::type;

namespace unit1 {
    using parser1_t = x3::rule<class u1, std::uint64_t>;
    BOOST_SPIRIT_DECLARE(parser1_t);
}

unit1::parser1_t const& parser1();

#endif /* UNIT1_H */

unit1.cpp

#include "unit1.h"

namespace unit1 {
    parser1_t const parser1 = "unit1_rule";
    auto const parser1_def = x3::uint_;
    BOOST_SPIRIT_DEFINE(parser1)
    BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t)
}
unit1::parser1_t const& parser1() { return unit1::parser1; }

main.cpp

#include <iostream>
#include <sstream>
#include <string>
#include "unit1.h"

namespace x3 = boost::spirit::x3;
#define TEST2

#ifdef TEST2
    auto const parser2 = x3::rule<class u2, uint64_t>{"parser2"};
    auto const parser2_def = "Trace address: " >> parser1();
    BOOST_SPIRIT_DECLARE(decltype(parser2))
    BOOST_SPIRIT_DEFINE(parser2)
    BOOST_SPIRIT_INSTANTIATE(decltype(parser2),iter_t,context_t)
#endif

int main(int argc, char* argv[])
{
    std::string input("Trace address: 123434");
    std::istringstream i(input);

    std::cout << "parsing: " << input << "\n";

    boost::spirit::istream_iterator b{i >> std::noskipws};
    boost::spirit::istream_iterator e{};

    uint64_t addr=0;
#ifdef TEST0
    bool v = x3::phrase_parse(b, e, "Trace address: " >> parser1(), x3::ascii::space,addr);
#elif defined TEST1
    auto const parser2 
        = x3::rule<class u2, uint64_t>{ "parser2" } 
        = "Trace address: " >> parser1();
    bool v = x3::phrase_parse(b, e, parser2, x3::ascii::space,addr);
#elif defined TEST2
    bool v = x3::phrase_parse(b, e, parser2, x3::ascii::space,addr);
#endif 
    std::cout << "result: " << (v ? "OK" : "Failed") << "\n";
    std::cout << "result: " << addr << "\n";
    return v;
}

I feel I am not doing this correctly. Here are my questions:

Unresolved external symbols and parser Context

In configuration TEST1 the error message is undefined reference to unit1::parse_rule<...>, which means parser1 is not instantiated with the right context. OK, but then which context should I use in such a situation? Even if I move parser2 out of the main() function, I get more or less the same issue. I can of course display the context and try to BOOST_SPIRIT_INSTANTIATE() with it, but I feel this is not the way to go. Surprisingly, instantiating parser2 instead seems to solve the issue (on Visual Studio at least).

Mixing rules from separated translation units

Why is it so complicated, whereas if I remove the rule in parser2, everything works OK?

Heyji

1 Answer

Q. Why is it so complicated [...]

The machinery to statically link rule definitions to rules by their tag type (rule-id) is tricky. It in fact hinges on there being a specialization of a parse_rule¹ function template.

However, the function template depends on:

  • the rule id ("tag type")
  • iterator type
  • the context (includes things like skipper or with<> directives)

All of the types must match exactly. This is a frequent source of error.
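
To make the "must match exactly" point concrete, here is a simplified model in plain C++ (no Boost). Everything in it is a hypothetical stand-in rather than an X3 type: `has_definition` plays the role of "a definition was instantiated for this combination", and the single explicit specialization plays the role of the one instantiation that lives in the other TU.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration only (not X3 types).
struct rule_tag {};                                 // the rule id ("tag type")
struct skipper_context {};                          // context holding only a skipper
template <typename Inner> struct wrapped_context {}; // same context, one layer deeper

// Primary template: no definition is available for this combination.
template <typename Tag, typename Iterator, typename Context>
struct has_definition : std::false_type {};

// The one combination that was "explicitly instantiated" elsewhere.
template <>
struct has_definition<rule_tag, char const*, skipper_context> : std::true_type {};

// The exact triple is found.
static_assert(has_definition<rule_tag, char const*, skipper_context>::value,
              "exact match is available");
// A different iterator type is a different instantiation.
static_assert(!has_definition<rule_tag, char*, skipper_context>::value,
              "iterator mismatch");
// A context with one extra layer is a different instantiation as well.
static_assert(!has_definition<rule_tag, char const*,
                              wrapped_context<skipper_context>>::value,
              "context mismatch");
```

In the real library a mismatch does not show up as a `false` trait but as an undefined reference at link time, which is why it is so easy to miss.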

Q. [...] whereas if I remove the rule in parser2, everything works OK?

Likely because either the rule definition is visible to the compiler to instantiate at that point, or alternatively because the types match up as just described.

I'll look at your specific code shortly.

REPRO

Reading The Compiler Messages

My compiler warns with -DTEST1:

unit1.h|13 col 5| warning: ‘bool unit1::parse_rule(unit1::parser1_t, Iterator&, const Iterator&, const Context&, boost::spirit::x3::rule<unit1::u1, long unsigned int>::attribute_type&) [with Iterator = boost::spirit::basic_istream_iterator<char>; Context = boost::spirit::x3::context<main()::u2, const boost::spirit::x3::sequence<boost::spirit::x3::literal_string<const char*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, long unsigned int> >, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, const boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag>, boost::spirit::x3::unused_type> >]’ used but never defined

This spells the exact type arguments for the template specialization to explicitly-instantiate in a TU.

The linker error spells the missing symbol:

/home/sehe/custom/spirit/include/boost/spirit/home/x3/nonterminal/rule.hpp:135: undefined reference to `bool unit1::parse_rule<boost::spirit::basic_istream_iterator<char, std::char_traits<char> >, boost::spirit::x3::context<main::u2, boost::spirit::x3::sequence<boost::spirit::x3::literal_string<char const*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, unsigned long, false> > const, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag> const, boost::spirit::x3::unused_type> > >(boost::spirit::x3::rule<unit1::u1, unsigned long, false>, boost::spirit::basic_istream_iterator<char, std::char_traits<char> >&, boost::spirit::basic_istream_iterator<char, std::char_traits<char> > const&, boost::spirit::x3::context<main::u2, boost::spirit::x3::sequence<boost::spirit::x3::literal_string<char const*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, unsigned long, false> > const, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag> const, boost::spirit::x3::unused_type> > const&, unsigned long&)'

All in all your task is to compare them (!!) and note the discrepancy.

Reading The Macro Magic

Expanding the macros gives:

template <typename Iterator, typename Context> inline bool parse_rule( decltype(parser1) , Iterator& first, Iterator const& last , Context const& context, decltype(parser1)::attribute_type& attr) { using boost::spirit::x3::unused; static auto const def_ = (parser1 = parser1_def); return def_.parse(first, last, context, unused, attr); }
template bool parse_rule<iter_t, context_t>( parser1_t rule_ , iter_t& first, iter_t const& last , context_t const& context, parser1_t::attribute_type&);

Which is for the ...DEFINE:

template <typename Iterator, typename Context>
inline bool parse_rule(decltype(parser1), Iterator& first,
    Iterator const& last, Context const& context,
    decltype(parser1)::attribute_type& attr)
{
    using boost::spirit::x3::unused;
    static auto const def_ = (parser1 = parser1_def);
    return def_.parse(first, last, context, unused, attr);
}

And for the explicit ...INSTANTIATE:

template bool parse_rule<iter_t, context_t>(parser1_t rule_, iter_t& first,
    iter_t const& last, context_t const& context,
    parser1_t::attribute_type&);

Substituting out the types shows exactly what is instantiated (see the warning above).
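
Stripped of X3, the three macro roles boil down to an ordinary C++ pattern: declare a function template, define it in one source file, and explicitly instantiate it for one exact argument list. The following is a minimal sketch of that pattern only. `number_rule`, `no_skipper` and `parse_number` are hypothetical names for illustration, and the body is a hand-written digit loop rather than anything X3 generates.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins (not X3 classes).
struct number_rule { using attribute_type = unsigned long; };
struct no_skipper {};
using iter_t = std::string::const_iterator;

// Role of DECLARE (header): promise that a definition exists somewhere.
template <typename Iterator, typename Context>
bool parse_rule(number_rule, Iterator& first, Iterator const& last,
                Context const& context, number_rule::attribute_type& attr);

// Role of DEFINE (source file): the generic body.
template <typename Iterator, typename Context>
bool parse_rule(number_rule, Iterator& first, Iterator const& last,
                Context const&, number_rule::attribute_type& attr)
{
    Iterator const start = first;
    attr = 0;
    while (first != last && *first >= '0' && *first <= '9')
        attr = attr * 10 + static_cast<unsigned long>(*first++ - '0');
    return first != start; // succeed only if at least one digit was consumed
}

// Role of INSTANTIATE (source file): emit the symbol for ONE exact
// <Iterator, Context> pair. Callers in other files using any other pair
// would get an undefined reference.
template bool parse_rule<iter_t, no_skipper>(number_rule, iter_t&,
    iter_t const&, no_skipper const&, number_rule::attribute_type&);

// Small helper used in the usage example: requires the whole input to match.
inline bool parse_number(std::string const& text, unsigned long& out)
{
    iter_t first = text.begin();
    iter_t const last = text.end();
    return parse_rule(number_rule{}, first, last, no_skipper{}, out)
        && first == last;
}
```

With this in place, `parse_number("123434", n)` succeeds and stores 123434 in `n`, while `parse_number("12x", n)` fails because the input is not fully consumed.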

Other Options

Rather than straining our eyes, we know which template type parameters could be wrong, so let's check them:

  1. iterator:

    static_assert(std::is_same_v<iter_t, boost::spirit::istream_iterator>);
    iter_t b{i >> std::noskipws}, e {};
    

    This was not the culprit, the compiler confirms.

  2. The skipper ought to be x3::ascii::space_type which also seems to match up fine.

  3. The problem must be the context. Now let's extract the context from the linker error:

    bool unit1::parse_rule<...> >
    (x3::rule<unit1::u1, unsigned long, false>, iter_t &, iter_t const &,
    
     // this is the context:
     x3::context<
         main::u2,
         x3::sequence<x3::literal_string<char const *,
                                         boost::spirit::char_encoding::standard,
                                         x3::unused_type>,
                      x3::rule<unit1::u1, unsigned long, false>> const,
         x3::context<x3::skipper_tag,
                     x3::char_class<boost::spirit::char_encoding::ascii,
                                    x3::space_tag> const,
                     x3::unused_type>> const &,
    
     // this is the attribute
     unsigned long &);
    

That doesn't look like the context we expect. I reckon the problem is that the parser2 definition is "in sight", so the context ends up containing that definition (this is the mechanism that allows local x3::rule definitions without any define macro magic at all).
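
As a rough illustration of why the type changes, the context can be pictured as a compile-time linked list of (tag, value) nodes. The sketch below is a miniature in plain C++, not X3's own `x3::context`; every name in it (`context_node`, `declared_context`, `nested_context`, `depth`) is hypothetical. It only shows that prepending a node for an inline rule definition produces a different type from the one the out-of-line rule was instantiated for.

```cpp
#include <type_traits>

// Hypothetical miniature of a tag/value context chain (not X3's class).
struct empty_context {};
template <typename Tag, typename Value, typename Next = empty_context>
struct context_node {};

struct skipper_tag {};
struct space_skipper {};
struct parser2_tag {};        // stands in for the rule id of the outer rule
struct parser2_definition {}; // stands in for its inline definition

// What the out-of-line rule was instantiated for: the skipper only.
using declared_context = context_node<skipper_tag, space_skipper>;

// What a nested rule is asked to parse with when the enclosing rule's
// definition is attached inline: one extra node in front.
using nested_context =
    context_node<parser2_tag, parser2_definition, declared_context>;

// Counts the nodes in a chain.
template <typename Context>
struct depth : std::integral_constant<int, 0> {};
template <typename Tag, typename Value, typename Next>
struct depth<context_node<Tag, Value, Next>>
    : std::integral_constant<int, 1 + depth<Next>::value> {};

static_assert(depth<declared_context>::value == 1, "skipper only");
static_assert(depth<nested_context>::value == 2, "definition + skipper");
static_assert(!std::is_same<declared_context, nested_context>::value,
              "the two contexts are distinct types");
```

Since the context is a template argument of `parse_rule`, two distinct context types mean two distinct symbols, and only one of them was instantiated.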

I remember a more recent mailing list post pointing this out (and it was kind of a surprise to me back then): https://sourceforge.net/p/spirit/mailman/message/37194823/

On di, 05. jan 13:12, Larry Evans wrote:

However, there's another reason to use BOOST_SPIRIT_DEFINE. When there is a lot of recursive rules, and BOOST_SPIRIT_DEFINE is not used, this causes much heavier template processing and concomitant slow compile times. The reason is that, without BOOST_SPIRIT_DEFINE, the definition for a rule is stored in the context and this is what causes the explosion in compile-times.

So, be aware of this when you notice compile times slow as you add more recursive rules.

Thanks for pointing this out. I've run into this without realizing that omitting the definition-separation was a critical factor.

I guess then that it also could provide relief in some cases that cause extreme template recursion when the rules change skipper (Because the context keeps being technically different).

Again, this is actually a very helpful note. Thanks.

Seth

Earlier in the thread I express reasons why I dislike the macro machinery and never spread my X3 rules across TUs. By now you might appreciate that sentiment :)

Workarounds

You could work around this by manufacturing the correct context type and instantiating that (as well). In unit1.h:

struct u2;
using context2_t = x3::context<
    u2,
    decltype("" >> parser1_t{}) const,
    context_t>;

BOOST_SPIRIT_DECLARE(parser1_t)

And in the cpp:

BOOST_SPIRIT_DEFINE(parser1)
BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t) // optionally
BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context2_t)

Not surprisingly, this works: https://wandbox.org/permlink/Y6NsKCcIDgiHGJf2

Summary

To my own surprise, I once again learn a reason to dislike X3's rule separation magic. However, if you need it, you should probably not mix and match, but define parser2 out-of-line as well.

namespace unit2 {
    parser2_t parser2 = "unit2_rule";
    auto const parser2_def = "Trace address: " >> parser1();

    BOOST_SPIRIT_DEFINE(parser2)
    BOOST_SPIRIT_INSTANTIATE(parser2_t, iter_t, context_t)
} // namespace unit2

See it Live On Wandbox again

Full Listings

For posterity from Wandbox:

  • File unit1.cpp

     #include "unit1.h"
    
     namespace unit1 {
         parser1_t parser1 = "unit1_rule";
         auto const parser1_def = x3::uint_;
    
         BOOST_SPIRIT_DEFINE(parser1)
         BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t)
     } // namespace unit1
     unit1::parser1_t const &parser1() { return unit1::parser1; }
    
  • File unit1.h

     #ifndef UNIT1_H
     #define UNIT1_H
     #include "boost/spirit/home/x3.hpp"
     #include "boost/spirit/include/support_istream_iterator.hpp"
     #include <cstdint>
    
     namespace x3    = boost::spirit::x3;
     using iter_t    = boost::spirit::istream_iterator;
     using context_t  = x3::phrase_parse_context<x3::ascii::space_type>::type;
    
     namespace unit1 {
         using parser1_t = x3::rule<class u1, std::uint64_t> const;
         BOOST_SPIRIT_DECLARE(parser1_t)
     } // namespace unit1
    
     unit1::parser1_t const &parser1();
    
     #endif /* UNIT1_H */
    
  • File unit2.cpp

     #include "unit2.h"
     #include "unit1.h"
    
     namespace unit2 {
         parser2_t parser2 = "unit2_rule";
         auto const parser2_def = "Trace address: " >> parser1();
    
         BOOST_SPIRIT_DEFINE(parser2)
         BOOST_SPIRIT_INSTANTIATE(parser2_t, iter_t, context_t)
     } // namespace unit2
     unit2::parser2_t const &parser2() { return unit2::parser2; }
    
  • File unit2.h

     #ifndef UNIT2_H
     #define UNIT2_H
     #include "boost/spirit/home/x3.hpp"
     #include "boost/spirit/include/support_istream_iterator.hpp"
     #include <cstdint>
    
     namespace x3    = boost::spirit::x3;
     using iter_t    = boost::spirit::istream_iterator;
     using context_t  = x3::phrase_parse_context<x3::ascii::space_type>::type;
    
     namespace unit2 {
         using parser2_t = x3::rule<class u2, std::uint64_t> const;
         BOOST_SPIRIT_DECLARE(parser2_t)
     } // namespace unit2
    
     unit2::parser2_t const &parser2();
    
     #endif /* UNIT2_H */
    
  • File main.cpp

     #include "unit2.h"
     #include <iostream>
     #include <sstream>
     #include <string>
     #include <type_traits>
    
     namespace x3 = boost::spirit::x3;
    
     int main() {
         std::string input("Trace address: 123434");
         std::istringstream i(input);
    
         std::cout << "parsing: " << input << "\n";
    
         static_assert(std::is_same_v<iter_t, boost::spirit::istream_iterator>);
         iter_t b{i >> std::noskipws}, e {};
    
         uint64_t addr = 0;
         bool v = x3::phrase_parse(b, e, parser2(), x3::ascii::space, addr);
         std::cout << "result: " << (v ? "OK" : "Failed") << "\n";
         std::cout << "result: " << addr << "\n";
         return v;
     }
    
sehe
  • Solved it - two ways. And reconfirmed my stance from here: https://stackoverflow.com/questions/65566480/boost-spirit-define-not-understand – sehe Feb 03 '21 at 23:50
  • Thank you for showing how to debug this kind of issue, which is very valuable for this library. I need time now to digest all of this (and rest my eyes too!). I understand that moving from Visual Studio to another compiler might help, since its error messages seem more readable. – Heyji Feb 04 '21 at 09:04
  • Still, the way to go is not obvious for non experts, and embedding a parser defined in a separate translation unit into a rule is an easy/frequent trap. – Heyji Feb 04 '21 at 09:06
  • Concerning the spread of parsers across several TUs, it would be fine if I were not missing a very common one: a hex parser for 64-bit unsigned integers, which is nowadays very common for parsing memory addresses. That's the only reason why I used a separate TU. – Heyji Feb 04 '21 at 09:11
  • In that case: https://godbolt.org/z/z4EEqv (note how I embed the skipper into the rule) – sehe Feb 04 '21 at 14:55
  • Returning a reference to a rule placeholder across TU is penny-wise and pound-foolish. – Nikita Kniazev Feb 04 '21 at 17:47
  • @sehe : there is no such `x3::uint_parser` in x3 documentation, as far as I know. Knowing there is such a beast would have spared me a LOT (months) of time. – Heyji Feb 04 '21 at 20:13
  • You can file a doc change request or add one yourself. Apparently QI docs [mentioned uint_parser](https://www.boost.org/doc/libs/1_75_0/libs/spirit/doc/html/spirit/qi/reference/numeric/uint.html#boost-common-heading-doc-spacer:~:text=The%20uint_parser%20class%20is%20the%20simplest,integers%20of%20arbitrary%20length%20and%20size) but the [x3 doc](https://www.boost.org/doc/libs/1_75_0/libs/spirit/doc/x3/html/spirit_x3/quick_reference/numeric.html) don't anymore. – sehe Feb 04 '21 at 20:44
  • I have filed a doc change – Heyji Feb 05 '21 at 06:34