
I would like to parse a Number, store its original source text, and track its position in the source, keeping all of that in the structure itself.

This is what I have so far:

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_object.hpp>
#include <boost/spirit/home/support/iterators/line_pos_iterator.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>

#include <iostream>
#include <iomanip>
#include <ios>
#include <string>
#include <complex>

#include <boost/spirit/include/phoenix_fusion.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>

struct Position
{
    Position()
        : line(-1)
    {
    }

    size_t line;
};

struct Number : public Position
{
    Number()
        : Position()
        , value(-1)
        , source()
    {
    }

    unsigned    value;
    std::string source;
};

using namespace boost::spirit;

BOOST_FUSION_ADAPT_STRUCT(Number,
                            (unsigned,    value)
                            (std::string, source)
                            (size_t,      line)
                          );

template <typename Iterator>
struct source_hex : qi::grammar<Iterator, Number()>
{
    source_hex() : source_hex::base_type(start)
    {
        using qi::eps;
        using qi::hex;
        using qi::lit;
        using qi::raw;
        using qi::_val;
        using qi::_1;
        using ascii::char_;

        namespace phx = boost::phoenix;
        using phx::at_c;
        using phx::begin;
        using phx::end;
        using phx::construct;

        start = raw[   (lit("0x") | lit("0X"))
                     >> hex [at_c<0>(_val) = _1]
                   ][at_c<2>(_val) = get_line(begin(_1))]
                    [at_c<1>(_val) = construct<std::string>(begin(_1), end(_1))]

        ;
    }

    qi::rule<Iterator, Number()> start;
};

And the test code is:

typedef line_pos_iterator<std::string::const_iterator> Iterator;
source_hex<Iterator> g;
Iterator iter(str.begin());
Iterator end(str.end());

Number number;
bool r = parse(iter, end, g, number);
if (r && iter == end) {
    std::cout << number.line << ": 0x" << std::setw(8) << std::setfill('0') << std::hex << number.value << " // " << number.source << "\n";
} else
    std::cout << "Parsing failed\n";

What I don't understand is why the iterator on this line:

[at_c<2>(_val) = get_line(begin(_1))]

is not a line_pos_iterator, even though that is the iterator type I am using for the parser. I would appreciate an explanation, as well as any ideas on how to solve the problem, in whatever way.

gsf
  • And obviously what I am doing is completely off, because get_line is called during the construction of the grammar – gsf Oct 02 '13 at 19:46
  • You need to call `get_line` as a 'lazy' functor (a Phoenix actor). See [this answer](http://stackoverflow.com/questions/8358975/cross-platform-way-to-get-line-number-of-an-ini-file-where-given-option-was-foun/8365427#8365427) for an example (an INI file parser) that uses it – sehe Oct 02 '13 at 19:53

1 Answer


Have a look at

#include <boost/spirit/repository/include/qi_iter_pos.hpp>

This defines a parser that directly exposes the position as an attribute. Let me add an example in a few minutes.

Edit: I found it hard to shoehorn iter_pos into your sample without assuming things and changing your data type layout. I'd very much favour doing that (I'd strive to lose the semantic actions altogether). However, time's limited.

Here's a little helper that you can use to fix your problem:

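// Polymorphic function object: wraps the free function get_line so that
// Phoenix can call it lazily, at parse time.
struct get_line_f
{
    // Result type protocol (boost::result_of): the call always yields size_t.
    template <typename> struct result { typedef size_t type; };
    template <typename It> size_t operator()(It const& pos_iter) const
    {
        return get_line(pos_iter);
    }
};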

^ That's the polymorphic actor. Use it like so:

    start = raw[ qi::no_case["0x"] >> hex [at_c<0>(_val) = _1] ]
               [ 
                   at_c<1>(_val) = construct<std::string>(begin(_1), end(_1)),
                   at_c<2>(_val) = get_line_(begin(_1)) 
               ]
    ;

    // with

boost::phoenix::function<get_line_f> get_line_;

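Note that I changed a few minor points: no_case["0x"] replaces the two lit alternatives, and both assignments now sit in a single semantic action.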

Fully running demo with output: Live On Coliru

sehe
  • It looks like `raw` is not working as expected. The example on Coliru parses both "0x 1234" and "0x1234" without a syntax error. Is it possible to treat the space as an error in the "0x 1234" case? – Dewfy May 27 '22 at 09:21
  • Raw has nothing to do with what input is accepted (it only defines what output is synthesized). What you're looking for are lexemes: https://stackoverflow.com/a/17073965/85371. Note that "not working as expected" and "without syntax errors" are both subjective. My answer did *not* assume an opinion there other than the code from the question. If you have a different opinion than person X in 2013, fine :) I say that makes your question a new/different question. – sehe May 27 '22 at 12:53
  • You are right about the "subjectiveness", thanks for the list of hints! – Dewfy May 27 '22 at 13:30
  • Here's the minimal "fix" and some modernizations: http://coliru.stacked-crooked.com/a/94525d5bc86ba85a – sehe May 27 '22 at 13:35
  • I see, you just removed any skippers from `qi::rule<> ... number`. Thank you! – Dewfy May 27 '22 at 13:48
  • @Dewfy Some more modernization seems in order: http://coliru.stacked-crooked.com/a/0fd6999745e3e144 - double the functionality in half the code – sehe May 27 '22 at 14:31