Constraining the existing Boost.Spirit real_parser (with a policy)

Question

I want to parse a float, but not allow NaN values, so I generate a policy which inherits from the default policy and create a real_parser with it:

// using boost::spirit::qi::{real_parser,real_policies,
//                           phrase_parse,double_,char_};

template <typename T>
struct no_nan_policy : real_policies<T>
{
    template <typename I, typename A>
    static bool parse_nan(I&, I const&, A&) {
        return false;
    }
};

real_parser<double, no_nan_policy<double> > no_nan;

// then I can use no_nan to parse, as in the following grammar
bool ok = phrase_parse(first, last, 
   no_nan[ref(valA) = _1] >> char_('@') >> double_[ref(b) = _1],
space);

But now I also want to ensure that the overall length of the string parsed with no_nan does not exceed 4, i.e. "1.23" or ".123" or even "2.e6" or "inf" is ok, "3.2323" is not, nor is "nan". I cannot do that in the parse_n/parse_frac_n section of the policy: those hooks look at the digits left and right of the dot separately and cannot (cleanly) communicate with each other, which they would have to, since it is the overall length that matters.

The idea then was to extend real_parser (in boost/spirit/home/qi/numeric/real.hpp) and wrap the parse method -- but this class has no methods. Next to real_parser is the any_real_parser struct which does have parse, but these two structs do not seem to interact in any obvious way.

Is there a way to easily inject my own parse(), do some pre-checks, and then call the real parse (return boost::spirit::qi::any_real_parser<T, RealPolicy>::parse(...)) which then adheres to the given policies? Writing a new parser would be a last-resort method, but I hope there is a better way.

(Using Boost 1.55, i.e. Spirit 2.5.2, with C++11)

Given the set of rules you want for this node, it sounds like it is a mini-language all on its own. How about defining a grammar just for it, then seeing how to [compose grammars](http://stackoverflow.com/questions/17537438/how-can-i-extend-a-boost-spirit-grammar)? — Ami Tavory, May 21 '15 at 14:14
Well as long as it is awesome in some way... Seriously, though, that's an interesting assertion, but it would be nice if you could explain why (esp. the "slow" part). — Ami Tavory, May 21 '15 at 14:23
@AmiTavory: It seems I am so close, i.e. just a few changes to the `double_` parser and I'd be done. This would probably be a lot more maintainable than adding a new grammar, since all the other parsing is done that way. — toting, May 21 '15 at 14:51

score 2 · Accepted Answer · edited May 23 '17 at 11:51

It seems I am so close, i.e. just a few changes to the double_ parser and I'd be done. This would probably be a lot more maintainable than adding a new grammar, since all the other parsing is done that way. – toting 7 hours ago

Even more maintainable would be to not write another parser at all.

You basically want to parse a floating point number (Spirit has got you covered) but apply some validations afterward. I'd do the validations in a semantic action:

raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && px::size(_1)<=4 ]

That's it.

Explanations

Anatomy:

- double_ [_val = _1] parses a double and assigns it to the exposed attribute as usual¹
- raw [ parser ] matches the enclosed parser but exposes the raw source iterator range as an attribute
- [ _pass = !isnan_(_val) && px::size(_1)<=4 ] is the business part!

This semantic action attaches to the raw[] parser. Hence
- _1 now refers to the raw iterator range that already parsed the double_
- _val already contains the "cooked" value of a successful match of double_
- _pass is a Spirit context flag that we can set to false to make parsing fail.
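
To make the two checks concrete outside of Spirit, here is a minimal sketch of the same "parse first, validate afterward" logic using only the standard library. The helper name parse_constrained_real and its max_len parameter are hypothetical, and std::strtod is only a stand-in for double_: it also skips leading whitespace and accepts forms such as hex floats, so treat this as an illustration of the checks, not a replacement for the rule above.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Spirit): parse a leading real number from
// `text`, then accept it only if it is not NaN and the matched text spans at
// most `max_len` characters.
bool parse_constrained_real(const std::string& text, double& out,
                            std::size_t max_len = 4)
{
    const char* begin = text.c_str();
    char* end = nullptr;

    // The "cooked" value; plays the role of _val in the semantic action.
    const double value = std::strtod(begin, &end);

    // Number of characters consumed; plays the role of size(_1) on raw[].
    const std::size_t matched = static_cast<std::size_t>(end - begin);

    if (matched == 0)      return false; // nothing was parsed
    if (std::isnan(value)) return false; // the !isnan_(_val) check
    if (matched > max_len) return false; // the size(_1) <= 4 check

    out = value;
    return true;
}
```

With the inputs from the question, "1.23", ".123", "2.e6" and "inf" pass both checks, while "3.2323" fails the length check and "nan" fails the NaN check.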

Now the only thing left is to tie it all together. Let's make a deferred version of ::isnan:

boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);

We're good to go.
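
A note on portability: decltype(&::isnan) assumes that a single, non-overloaded isnan is declared in the global namespace, which depends on the standard library in use. std::isnan is overloaded for float, double and long double, so its address cannot be taken this way. If the line above does not compile on your toolchain, a small function object is one possible alternative. The name is_nan_impl below is hypothetical; the result_type typedef is there so the return type can be deduced when the functor is wrapped.

```cpp
#include <cmath>

// Hypothetical functor name. It forwards to the double overload of
// std::isnan, so no function address has to be taken.
struct is_nan_impl
{
    typedef bool result_type; // lets result_of-style deduction find the return type

    bool operator()(double v) const { return std::isnan(v); }
};

// Assuming Boost.Phoenix is available, the functor could be wrapped as
//     boost::phoenix::function<is_nan_impl> isnan_;
// and then used in the semantic action in the same way as isnan_ above.
```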

Test Program

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cmath>
#include <iostream>
#include <string>

int main ()
{
    using It = std::string::const_iterator;

    auto my_fpnumber = [] { // TODO encapsulate in a grammar struct
        using namespace boost::spirit::qi;
        using boost::phoenix::size;

        static boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);

        return rule<It, double()> (
                raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && size(_1)<=4 ]
            );
    }();

    for (std::string const s: { "1.23", ".123", "2.e6", "inf", "3.2323", "nan" })
    {
        It f = s.begin(), l = s.end();

        double result;
        if (parse(f, l, my_fpnumber, result))
            std::cout << "Parse success:  '" << s << "' -> " << result << "\n";
        else
            std::cout << "Parse rejected: '" << s << "' at '" << std::string(f,l) << "'\n";
    }
}

Prints

Parse success:  '1.23' -> 1.23
Parse success:  '.123' -> 0.123
Parse success:  '2.e6' -> 2e+06
Parse success:  'inf' -> inf
Parse rejected: '3.2323' at '3.2323'
Parse rejected: 'nan' at 'nan'

¹ The assignment has to be done explicitly here because we use semantic actions, and they normally suppress automatic attribute propagation.

@AmiTavory I didn't do a comparison. But consider that [Spirit has the fastest real number parsers around](http://tinodidriksen.com/2011/05/28/cpp-convert-string-to-double-speed/). Now, consider the complexity of writing the grammar you suggested (e.g. [like here](http://stackoverflow.com/q/11568787)). It involves 6 type-erased rules there, and it doesn't even piece the actual value together (good luck getting _that_ correct!). Furthermore, that is not flexible, because you cannot decide whether the separator _must_ be present, whether the integer part can be absent etc. etc. — sehe, May 21 '15 at 23:35
I think a head-to-head test is too much effort and clearly not needed here. If you want you can do a simple test (e.g. without actually calculating the result). — sehe, May 21 '15 at 23:36
How would you apply this same logic to check for bounds (i.e. `qi::_val` is less than `256`)? Naively, I would have expected `raw[qi::uint_[qi::_val = qi::_1]][qi::_pass = qi::_val < 256]` to have worked. — Zak, Jul 10 '18 at 17:39
I have formally asked this same question here: https://stackoverflow.com/questions/51272067/boost-spirit-qi-bounds-checking-against-primitive-data-types — Zak, Jul 10 '18 at 18:40