I want to use qi::int_parser<int64_t> to parse an integer value (it's really convenient how it automatically checks for overflows, handles the INT_MIN case, and so on). However, I also want to get the substring that the int_parser matched, because I want to print a warning message if it contains extraneous characters (i.e., a plus sign, leading zeroes, or the case of -0).

I saw in another answer the suggestion to use qi::as_string, but it doesn't seem to work in this case. Here is some code that illustrates the issue:

#include <boost/phoenix/core.hpp>
#include <boost/phoenix/operator.hpp>
#include <boost/spirit/include/qi.hpp>
#include <cstdint>
#include <iostream>
#include <string>
int main() {
    namespace phx = boost::phoenix;
    namespace qi = boost::spirit::qi;
    using namespace std;

    std::string value_str;
    int64_t value;
    std::string test_str = "<+123>";

    const auto success = qi::parse(test_str.begin(), test_str.end(),
        qi::char_('<') >>
        qi::as_string[
            qi::int_parser<int64_t>{}[phx::ref(value) = qi::_1]
        ][phx::ref(value_str) = qi::_1] >>
        qi::char_('>')
    );

    std::cout << "success: " << success << '\n';
    std::cout << "value: " << value << '\n';
    std::cout << "matched substring: " << value_str << '\n';
} 

The output I want is

success: 1
value: 123
matched substring: +123

The output I get is

success: 1
value: 123
matched substring: {

(or some other garbage). Parsing the int value works just fine, but I can't figure out how to get the substring.

Brian Bi

2 Answers

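Wrap the integer parser in qi::raw before passing it to qi::as_string. qi::raw discards the inner parser's attribute and exposes the matched iterator range instead, which qi::as_string then turns into a std::string: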

qi::raw[qi::int_parser<int64_t>{}[phx::ref(value) = qi::_1]]

Result:

success: 1
value: 123
matched substring: +123
doqtor

I'd simplify using a proper rule - so you don't need to spell out as_string in the parse expression.

raw[] misbehaves in this particular case: with the semantic action directly inside it, I got "+" as the matched substring instead of "+123" (this should be reported as a bug to the library maintainers). However, I could work around it by adding an eps parser inside the raw directive:

r %= '<' >> raw[int64_[phx::ref(value) = _1] >> eps] >> '>';

(Note also there is no need to parse the brackets with char_).

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <limits>
#include <string>
using namespace std::string_literals;

int main()
{
    namespace phx = boost::phoenix;
    namespace qi = boost::spirit::qi;

    qi::rule<std::string::const_iterator, std::string()> r;

    int64_t value;
    {
        using namespace qi;
        static const int_parser<int64_t> int64_{};
        r %= '<' >> raw[int64_[phx::ref(value) = _1] >> eps] >> '>';
    }

    using Lim = std::numeric_limits<decltype(value)>;
    for (std::string const test_str : {
             "<+123>"s,
             "<0123>"s,
             "<0123>"s,
             "<123>"s,
             "<" + std::to_string(Lim::max()) + ">",
             "<" + std::to_string(Lim::min()) + ">",
         })
    {
        std::string value_str;

        auto success
            = qi::parse(test_str.begin(), test_str.end(), r, value_str);

        std::cout << "success: " << std::boolalpha  << success << "\n";
        std::cout << "value: "             << value                  << "\n";
        std::cout << "matched substring: " << std::quoted(value_str) << "\n";
    }
}

Prints

success: true
value: 123
matched substring: "+123"
success: true
value: 123
matched substring: "0123"
success: true
value: 123
matched substring: "0123"
success: true
value: 123
matched substring: "123"
success: true
value: 9223372036854775807
matched substring: "9223372036854775807"
success: true
value: -9223372036854775808
matched substring: "-9223372036854775808"

BONUS

Encapsulate the "value" param as well by passing it as an inherited attribute, so the rule no longer refers to a variable outside itself:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <limits>
#include <string>
using namespace std::string_literals;
namespace phx = boost::phoenix;
namespace qi = boost::spirit::qi;

using It = std::string::const_iterator;
using T = std::int64_t;

struct Parser : qi::grammar<It, std::string(T&)> {
    Parser() : Parser::base_type(r) {
        r %= '<' >> qi::raw[ int64_[qi::_r1 = qi::_1] >> qi::eps ] >> '>';
    }
  private:
    qi::int_parser<T> int64_;
    qi::rule<It, std::string(T&)> r;
};

int main()
{
    Parser const p;
    using Lim = std::numeric_limits<T>;
    for (std::string const test_str : {
             "<+123>"s,
             "<0123>"s,
             "<0123>"s,
             "<123>"s,
             "<" + std::to_string(Lim::max()) + ">",
             "<" + std::to_string(Lim::min()) + ">",
         })
    {
        std::string value_str;
        int64_t value;

        auto success
            = qi::parse(test_str.begin(), test_str.end(), p(phx::ref(value)), value_str);

        std::cout << "success: " << std::boolalpha  << success << "\n";
        std::cout << "value: "             << value                  << "\n";
        std::cout << "matched substring: " << std::quoted(value_str) << "\n";
    }
}

Printing:

success: true
value: 123
matched substring: "+123"
success: true
value: 123
matched substring: "0123"
success: true
value: 123
matched substring: "0123"
success: true
value: 123
matched substring: "123"
success: true
value: 9223372036854775807
matched substring: "9223372036854775807"
success: true
value: -9223372036854775808
matched substring: "-9223372036854775808"
sehe
  • Added a BONUS take that encapsulates the parser and replaces the "global" variable with an inherited attribute: http://coliru.stacked-crooked.com/a/261b66785a435acb – sehe Feb 26 '21 at 15:46
  • It actually worked fine for me without the `eps` (though my actual use case is slightly different from what I showed in the question). What was the incorrect behaviour that was occurring without the `eps`? – Brian Bi Feb 26 '21 at 22:07
  • Yeah, it should work without. `raw[]` returns an iterator range over the entire matched input. I got "+" instead (repro: http://coliru.stacked-crooked.com/a/10d08baf100f6595) – sehe Feb 26 '21 at 22:09
  • I suspect the bug is in your version of Boost and not mine. My version is older. – Brian Bi Feb 26 '21 at 22:11
  • Hey, here's something I don't quite understand about your example: how does your rule `r` know to propagate the synthesized attribute from the middle operand (the `qi::raw` part)? Sorry if this is a very basic question. I literally started using Boost.Spirit yesterday. – Brian Bi Feb 26 '21 at 22:46
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/229273/discussion-between-sehe-and-brian-bi). – sehe Feb 26 '21 at 22:50
  • Ended up [filing the issue](https://github.com/boostorg/spirit/issues/653). Perhaps it won't be fixed (for backwards compatibility reasons), but it is important to be aware of this behavior. I think it also informs design choices in X3. – sehe Feb 27 '21 at 00:38