
I am just getting started with boost::spirit. I am trying to parse a 128-bit hex string into a plain C array of the corresponding size (16 bytes). I wrote a short test that does the job:

    boost::spirit::qi::int_parser< boost::uint8_t, 16, 2, 2 > uint8_hex;
    std::string src( "00112233445566778899aabbccddeeff" );
    boost::uint8_t dst[ 16 ];

    bool r;
    for( std::size_t i = 0; i < 16; ++i )
    {
        r = boost::spirit::qi::parse( src.begin( ) + 2 * i, src.begin( ) + 2 * i + 2, uint8_hex, dst[ i ] );
    }

I have the feeling that this is not the smartest way to do it :) Any ideas on how to define a rule so that I can avoid the loop?

Update:

In the meantime I came up with the following code, which does the job very well:

    using namespace boost::spirit;
    using namespace boost::phoenix;

    qi::int_parser< boost::uint8_t, 16, 2, 2 > uint8_hex;

    std::string src( "00112233445566778899aabbccddeeff" );

    boost::uint8_t dst[ 16 ];
    std::size_t i = 0;

    bool r = qi::parse( src.begin( ),
                        src.end( ),
                        qi::repeat( 16 )[ uint8_hex[ ref( dst )[ ref( i )++ ] = qi::_1 ] ] );
Maik

2 Answers


This does not answer the question literally, but if you really just want to parse the hexadecimal representation of a 128-bit integer, you can do so portably by using uint128_t from Boost Multiprecision:

// 128 bits = 32 hex digits, so require exactly 32 digits
qi::int_parser<uint128_t, 16, 32, 32> uint128_hex;

uint128_t parsed;
bool r = qi::parse(f, l, uint128_hex, parsed);

This is likely to be the quickest way, especially on platforms that support 128-bit integers natively.

Live On Coliru

#include <boost/multiprecision/cpp_int.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi  = boost::spirit::qi;

int main() {
    using boost::multiprecision::uint128_t;
    using It = std::string::const_iterator;
    // 128 bits = 32 hex digits, so require exactly 32 digits
    qi::int_parser<uint128_t, 16, 32, 32> uint128_hex;

    std::string const src("00112233445566778899aabbccddeeff");
    auto f(src.begin()), l(src.end());

    uint128_t parsed;
    bool r = qi::parse(f, l, uint128_hex, parsed);

    if (r) std::cout << "Parse succeeded: " << std::hex << std::showbase << parsed << "\n";
    else   std::cout << "Parse failed at '" << std::string(f,l) << "'\n";
}
sehe

There's a sad combination of factors that makes this a painful edge case:

  • Boost Fusion can adapt (boost::)array<>, but it requires the parser to result in a tuple of elements, not a container
  • Boost Fusion can adapt these sequences, but it needs to be configured to allow 16 elements:

    #define FUSION_MAX_VECTOR_SIZE 16
    
  • Even when you do, the qi::repeat(n)[] parser directive expects the attribute to be a container type.

You might work around all this in an ugly way (e.g. Live On Coliru), but that makes everything harder to work with down the road.

I'd prefer a tiny semantic action here that assigns each result from inside qi::repeat(n)[]:

    using data_t = boost::array<uint8_t, 16>;
    data_t dst {};

    qi::rule<It, data_t(), qi::locals<data_t::iterator> > rule =
        qi::eps [ qi::_a = phx::begin(qi::_val) ]
        >> qi::repeat(16) [
                uint8_hex [ *qi::_a++ = qi::_1 ]
        ];

This works without too much noise. The idea is to take the start iterator and write to the next element on each iteration.

Live On Coliru

#include <boost/array.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cstdint>
#include <iostream>
#include <string>

namespace qi  = boost::spirit::qi;
namespace phx = boost::phoenix;

int main() {
    using It = std::string::const_iterator;
    qi::int_parser<uint8_t, 16, 2, 2> uint8_hex;

    std::string const src("00112233445566778899aabbccddeeff");
    auto f(src.begin()), l(src.end());

    using data_t = boost::array<uint8_t, 16>;
    data_t dst {};

    // _a is a rule-local iterator into the attribute; each parsed byte is
    // written through it and the iterator is then advanced
    qi::rule<It, data_t(), qi::locals<data_t::iterator> > rule =
        qi::eps [ qi::_a = phx::begin(qi::_val) ]
        >> qi::repeat(16) [
                uint8_hex [ *qi::_a++ = qi::_1 ]
        ];

    bool r = qi::parse(f, l, rule, dst);

    if (r) {
        std::cout << "Parse succeeded\n";

        for(unsigned i : dst) std::cout << std::hex << std::showbase << i << " ";
        std::cout << "\n";
    } else {
        std::cout << "Parse failed at '" << std::string(f,l) << "'\n";
    }
}
sehe
  • I did not see your answer before. Thanks for that. In the meantime I updated my question and proposed a different solution. Because your solution is quite different I would like to ask what is the advantage of your solution or what is the disadvantage of mine ? Thanks. – Maik Jan 22 '15 at 18:17
  • Erm. What? You just told me I wasted my time answering this, and now you ask me to waste more time analyzing your own code to see what is the difference? (a) I have a hard time believing you can't do that yourself since you are clearly capable of solving the more advanced tasks with Spirit (b) I have no clue why you apparently don't know what the difference is, yet assume that my solution must be better? Maybe my code doesn't have an advantage. In short: None of this makes sense. – sehe Jan 22 '15 at 22:25
  • On topic: It looks like your solution is essentially the same. I see three main benefits: A. my code uses an iterator, which (a1.) makes the semantic action more succinct and (a2.) makes the code slightly more generic (it still works for any output iterator); B. it uses a local, which makes it easier to compose (try doing qi::phrase_parse(f, l, rule % ';', qi::space, dst), e.g. http://coliru.stacked-crooked.com/a/9afd160f658f8e29); C. (subjective:) it might be slightly more readable, perhaps because it favours static plumbing over extraneous moving parts (i and dst) – sehe Jan 22 '15 at 22:29