Using Xpressive
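The semantic action has to be a lazy actor. Your Data constructor call isn't
one: it is an ordinary C++ expression, evaluated while the regex object is being
built rather than when a match is found. Wrapping it in construct<Data>(...)
defers the construction until the action runs.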
Live On Coliru
    #include <boost/range/iterator_range.hpp>
    #include <boost/xpressive/regex_actions.hpp>
    #include <boost/xpressive/xpressive.hpp>
    #include <iostream>
    #include <string>
    #include <vector>

    namespace bex = boost::xpressive;

    struct Data {
        int integer;
        double real;
        std::string str;

        Data(int integer, double real, std::string str)
            : integer(integer), real(real), str(str) { }
    };

    int main() {
        std::vector<Data> container;
        std::string const input =
            "Int: 0 - Real: 18.8 - Str: ABC-1005\n"
            "Int: 0 - Real: 21.3 - Str: BCD-1006\n";

        using namespace bex;
        // construct<Data>(...) is a lazy actor: a Data is built per match,
        // not once while this expression is being assembled
        bex::sregex const parser =
            ("Int: " >> (s1 = _d)
             >> " - Real: " >> (s2 = (repeat<1,2>(_d) >> '.' >> _d))
             >> " - Str: " >> (s3 = +set[alnum | '-']) >> _n)
            [bex::ref(container)->*bex::push_back(bex::construct<Data>(as<int>(s1), as<double>(s2), s3))];

        // iterating drives the matching; each successful match runs the action
        bex::sregex_iterator cur(input.begin(), input.end(), parser), end;
        for (auto const& what : boost::make_iterator_range(cur, end)) {
            std::cout << what.str() << "\n";
        }

        for (auto& r : container) {
            std::cout << "[ " << r.integer << "; " << r.real << "; " << r.str << " ]\n";
        }
    }
Prints
    Int: 0 - Real: 18.8 - Str: ABC-1005
    Int: 0 - Real: 21.3 - Str: BCD-1006
    [ 0; 18.8; ABC-1005 ]
    [ 0; 21.3; BCD-1006 ]
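The eager-versus-lazy distinction is not specific to Xpressive. As a loose, library-free analogy, here is a minimal sketch in plain C++. All names in it (for_each_match, collect_eager, collect_lazy) are hypothetical and appear nowhere in Boost; for_each_match merely stands in for a matcher that runs an action once per match. It is an illustration of the idea, not of how Xpressive is implemented.

```cpp
#include <functional>
#include <string>
#include <vector>

struct Data {
    int integer;
    double real;
    std::string str;
};

// Hypothetical stand-in for a matcher: calls `action` once per "match".
inline void for_each_match(std::vector<std::string> const& matches,
                           std::function<void(std::string const&)> const& action) {
    for (auto const& m : matches) {
        action(m);
    }
}

// Eager: the Data object is built once, before any match exists,
// so every match ends up storing the same stale copy.
inline std::vector<Data> collect_eager(std::vector<std::string> const& matches) {
    std::vector<Data> out;
    Data const built_too_early{0, 0.0, "<no match yet>"};
    for_each_match(matches, [&](std::string const&) { out.push_back(built_too_early); });
    return out;
}

// Lazy: construction is deferred into the action, so it runs once per
// match and sees that match's text.
inline std::vector<Data> collect_lazy(std::vector<std::string> const& matches) {
    std::vector<Data> out;
    for_each_match(matches, [&](std::string const& m) { out.push_back(Data{0, 0.0, m}); });
    return out;
}
```

Calling collect_lazy with two matched strings yields two distinct records, whereas collect_eager yields two copies of the placeholder. In the Xpressive code, construct<Data>(...) plays the role of the lambda body in collect_lazy.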
Using Spirit
I'd use Spirit for this. Spirit has primitives that parse directly into the underlying data types, which is less error-prone and more efficient.
Spirit Qi (V2)
Using Phoenix, it's pretty similar: Live On Coliru
Using Fusion adaptation, it gets more interesting, and a lot simpler:
Live On Coliru
Now imagine:
- You wanted to match the keywords case-insensitively
- You wanted to make whitespace insignificant
- You wanted to accept empty lines, but not random data in between
How would you do that in Xpressive? Here's how you'd do it with Spirit. Note how the additional constraints leave the grammar essentially unchanged. Contrast that with regex-based parsers.
Live On Coliru
    #include <boost/fusion/adapted/struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>
    #include <vector>

    namespace qi = boost::spirit::qi;

    struct Data {
        int integer;
        double real;
        std::string str;
    };

    BOOST_FUSION_ADAPT_STRUCT(Data, integer, real, str)

    int main() {
        std::vector<Data> container;
        using It = std::string::const_iterator;
        std::string const input =
            "iNT: 0 - Real: 18.8 - Str: ABC-1005\n\n"
            "Int: 1-Real:21.3 -sTR:BCD-1006\n\n";

        // no_case: keywords match in any case
        // blank skipper: spaces and tabs are insignificant
        // +eol: empty lines between records are accepted
        qi::rule<It, Data(), qi::blank_type> parser = qi::no_case[
            qi::lit("int") >> ':' >> qi::auto_ >> '-'
            >> "real" >> ':' >> qi::auto_ >> '-'
            >> "str" >> ':' >> +(qi::alnum|qi::char_('-')) >> +qi::eol
        ];

        It f = input.begin(), l = input.end();
        if (parse(f, l, qi::skip(qi::blank)[*parser], container)) {
            std::cout << "Parsed:\n";
            for (auto& r : container) {
                std::cout << "[ " << r.integer << "; " << r.real << "; " << r.str << " ]\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        // anything left unparsed is reported instead of silently ignored
        if (f != l) {
            std::cout << "Remaining input: '" << std::string(f,l) << "'\n";
        }
    }
Still prints
    Parsed:
    [ 0; 18.8; ABC-1005 ]
    [ 1; 21.3; BCD-1006 ]
Further thoughts: how would you
- Parse scientific notation? Negative numbers?
- Parse decimal numbers correctly (if you are really parsing financial amounts, you may not want inexact floating-point representations)
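To make the decimal question concrete, here is a minimal sketch of one possible approach: reading the text into a scaled integer (hundredths) instead of a double. It uses only the standard library. The helper name parse_hundredths and the two-fractional-digit limit are assumptions made for illustration, not something prescribed above; it accepts an optional sign, deliberately rejects scientific notation, and does no overflow checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: parses "[+-]digits[.d[d]]" into hundredths,
// e.g. "18.8" becomes 1880. Returns false on malformed input or on
// more than two fractional digits. No overflow checking is attempted.
inline bool parse_hundredths(std::string const& text, std::int64_t& out) {
    std::size_t i = 0;

    // optional sign
    bool negative = false;
    if (i < text.size() && (text[i] == '-' || text[i] == '+')) {
        negative = (text[i] == '-');
        ++i;
    }

    // integral part: at least one digit is required
    std::int64_t whole = 0;
    std::size_t whole_digits = 0;
    for (; i < text.size() && text[i] >= '0' && text[i] <= '9'; ++i, ++whole_digits) {
        whole = whole * 10 + (text[i] - '0');
    }
    if (whole_digits == 0) return false;

    // optional fractional part: one or two digits after the dot
    std::int64_t frac = 0;
    std::size_t frac_digits = 0;
    if (i < text.size() && text[i] == '.') {
        ++i;
        for (; i < text.size() && text[i] >= '0' && text[i] <= '9'; ++i, ++frac_digits) {
            frac = frac * 10 + (text[i] - '0');
        }
        if (frac_digits == 0 || frac_digits > 2) return false;
        if (frac_digits == 1) frac *= 10;  // "18.8" means 80 hundredths
    }

    // trailing characters (such as an exponent) make the input invalid
    if (i != text.size()) return false;

    std::int64_t const value = whole * 100 + frac;
    out = negative ? -value : value;
    return true;
}
```

The point of the design is that 18.8 is stored exactly as the integer 1880, so no binary rounding can creep in. The same idea could be expressed inside a Spirit grammar with integer parsers and a custom attribute type.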
Spirit X3
If you can use C++14, Spirit X3 can be more efficient and compiles a lot faster than either the Spirit Qi or the Xpressive approach:
Live On Coliru
    #include <boost/fusion/adapted/struct.hpp>
    #include <boost/spirit/home/x3.hpp>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Data {
        int integer;
        double real;
        std::string str;
    };

    BOOST_FUSION_ADAPT_STRUCT(Data, integer, real, str)

    namespace Parsers {
        using namespace boost::spirit::x3;

        // one record; the rule's attribute type is ::Data
        static auto const data
            = rule<struct Data_, ::Data> {}
            = no_case[
                lit("int") >> ':' >> int_ >> '-'
                >> "real" >> ':' >> double_ >> '-'
                >> "str" >> ':' >> +(alnum|char_('-')) >> +eol
            ];

        // any number of records, skipping blanks (spaces and tabs)
        static auto const datas = skip(blank)[*data];
    }

    int main() {
        std::vector<Data> container;
        std::string const input =
            "iNT: 0 - Real: 18.8 - Str: ABC-1005\n\n"
            "Int: 1-Real:21.3 -sTR:BCD-1006\n\n";

        auto f = input.begin(), l = input.end();
        if (parse(f, l, Parsers::datas, container)) {
            std::cout << "Parsed:\n";
            for (auto& r : container) {
                std::cout << "[ " << r.integer << "; " << r.real << "; " << r.str << " ]\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        // anything left unparsed is reported instead of silently ignored
        if (f != l) {
            std::cout << "Remaining input: '" << std::string(f,l) << "'\n";
        }
    }
Prints (it's getting boring):
    Parsed:
    [ 0; 18.8; ABC-1005 ]
    [ 1; 21.3; BCD-1006 ]