4

This question is a follow-up to "Iterative update of abstract syntax tree with boost spirit".

Known:

  • Parser grammar allows recursion

The requirements are:

  • The AST for the parser must be a BGL graph.
  • Input can be one-to-many symbols per parser step

Ideas:

  • Some basic ideas about spirit parsing into a BGL graph is show here Using boost graph library: how to create a graph..., but does not fully meets the requirements, as I want to be able to parse one-to-many symbols iteratively.
  • Guess the BGL graph and spirit parser must know something about each other in order to populate data at the right spot. First thought is that the parser must be able to work on graph vertices.
  • Solution such as Using semantic actions/qi::locals might be applicable, but I am not sure if that is enough to allow the parser to work on the graph iteratively.

Do any one have some ideas how to solve this, or point me in some direction?

Thanks

Community
  • 1
  • 1
Dev Dev
  • 105
  • 6
  • 1
    I'd suggest you include a SSCCE this time. If it's a follow up, I'm really hesitant to invent a new grammar, input data, a graph type and a compelling demo from scratch :) – sehe Feb 06 '15 at 09:44
  • Understandable. I will try – Dev Dev Feb 06 '15 at 09:50
  • I am still working on a solution... – Dev Dev Feb 23 '15 at 12:50
  • Good :) I'm still here – sehe Feb 23 '15 at 12:51
  • Since you posted your answer, things started to look a lot simpler. It looks like you no longer require "The AST for the parser must be a BGL graph" - but rather it could be "the parser drives a function that operates on an BGL graph". In this vein I've [posted a new answer](http://stackoverflow.com/a/29176101/85371) – sehe Mar 20 '15 at 21:32

2 Answers2

1

In short: This is very difficult to accomplish.

If one wants to parse some data into a BGL graph, the easiest way might be to use the BGL graph adaptor vector_as_graph. Given this adaptor and given that the boost spirit parser can parse into a vector, then it is possible.

enum { r, s, t, u, v, w, x, y, N };
char name[] = "rstuvwxy";
typedef std::vector < std::list < int > > Graph;
Graph g(N);
g[r].push_back(v);
g[s].push_back(r);
g[s].push_back(r);
g[s].push_back(w);
g[t].push_back(x);
g[u].push_back(t);
g[w].push_back(t);
g[w].push_back(x);
g[x].push_back(y);
g[y].push_back(u);
boost::print_graph(g, name);

The solution I have used is to parse the given grammar in the boost spirit domain, and then perform transformations to the AST in order to be represented as a BGL graph.

Final thoughts: It should be noted that the learning curve of both the boost spirit and BGL domain is fairly steep. Based on this, the readability and implementation is easier if keep apart. Not saying it cant be solved.

sehe
  • 374,641
  • 47
  • 450
  • 633
Dev Dev
  • 105
  • 6
1

Responding to your

Final thoughts: It should be noted that the learning curve of both the boost spirit and BGL domain is fairly steep. Based on this, the readability and implementation is easier if keep apart. Not saying it cant be solved.

in your answer, I agree. Key to keeping things simple is to separate concerns. (Just because Phoenix lets you tie in almost anything right into your parser, doesn't mean it's a good idea[1].)

Maybe this sample application can serve as inspiration?

Live On Coliru

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <sstream>

using namespace boost;
typedef adjacency_list<setS, vecS, directedS> Graph;

std::vector<std::string> generate_file_data();
void merge_into(std::istream&& is, Graph& into);

int main() {
    Graph merged;

    for(auto& input: generate_file_data())         
        merge_into(std::istringstream(input), merged);

    print_graph(merged);
}

As you can see, it reads a number of (different) files and merges the edges into the merged graph.

Of course, in an online demo we don't have the input files, so I generate 5 random graphs of 6 random edges among 20 vertices (generate_file_data()).

The interesting bit is in merge_into:

#include <boost/spirit/include/qi.hpp>

void merge_into(std::istream&& is, Graph& into) {
    namespace qi  = boost::spirit::qi;
    namespace phx = boost::phoenix;

    phx::function<edge_merge_f> merge_edges;
    boost::spirit::istream_iterator f(is >> std::noskipws), l;

    bool ok = qi::phrase_parse(f, l, 
            (qi::int_ >> *qi::int_) [ merge_edges(qi::_1, qi::_2, phx::ref(into)) ] % qi::eol,
            qi::blank);

    assert(ok);
}

This is the only piece of the code that deals with spirit. As you can see, it doesn't actually know about BGL at all. That logic is left to a "callback" function object edge_merge_f, which is also pretty trivial:

struct edge_merge_f {
    template <typename...> struct result { typedef void type; }; // for BOOST_RESULT_OF_USE_TR1:

    template<typename V, typename Edges, typename G>
        void operator()(V const& v, Edges const& edges, G& into) const {
            for (auto e : edges)
                if (!add_edge(v, e, into).second)
                    std::cout << "Duplicate edge: (" << v << "," << e << ")\n";
        }
};

UPDATE As a bonus, a version that completely decouples the merge_adjacencies from the Graph type by passing the EdgeCallback to it as a boost::function: Decoupled version. Of course this will be less efficient, but it shows what I mean by keeping things separate.

So there you have it. Here's the output on my machine (I don't seed the random engine so it's repeatable): Live On Coliru

Duplicate edge: (9,1)
Duplicate edge: (0,4)
0 --> 1 2 4 9 
1 --> 
2 --> 
3 --> 1 7 
4 --> 1 
5 --> 1 6 
6 --> 1 3 5 
7 --> 5 6 8 
8 --> 3 
9 --> 1 2 

If you change the Graph typedef to use vecS for the edge container, you don't have to change anything else, and the result is instead: Live On Coliru

0 --> 4 9 1 4 2 
1 --> 
2 --> 
3 --> 7 1 
4 --> 1 
5 --> 1 6 
6 --> 1 3 5 
7 --> 5 6 8 
8 --> 3 
9 --> 1 2 1 

[1] I suppose I linked you to Boost Spirit: "Semantic actions are evil"? before :)

Full Listing

For posterity (also includes generate_file_data()):

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <sstream>

using namespace boost;
typedef adjacency_list<setS/*vecS*/, vecS, directedS> Graph;

std::vector<std::string> generate_file_data();
void merge_into(std::istream&& is, Graph& into);

int main() {
    Graph merged;

    for(auto& input: generate_file_data())         
        merge_into(std::istringstream(input), merged);

    print_graph(merged);
}

////////////////////////////////
// Generating random test data
#include <boost/graph/random.hpp>
#include <boost/random.hpp>

std::vector<std::string> generate_file_data() {
    std::vector<std::string> data;

    mt19937 prng(42);
    for (int i=0; i<5; ++i) {
        std::ostringstream oss;

        Graph g;
        generate_random_graph(g, 10, 4, prng);

        for (auto v : make_iterator_range(vertices(g))) {
            oss << v << " ";
            for (auto const& e : make_iterator_range(out_edges(v, g))) oss << target(e, g) << " ";
            oss << "\n";
        }

        data.push_back(oss.str());
    }

    return data;
}

////////////////////////////////
// Merging edges
namespace {
    struct edge_merge_f {
        template <typename...> struct result { typedef void type; }; // for BOOST_RESULT_OF_USE_TR1:

        template<typename V, typename Edges, typename G>
            void operator()(V const& v, Edges const& edges, G& into) const {
                for (auto e : edges)
                    if (!add_edge(v, e, into).second)
                        std::cout << "Duplicate edge: (" << v << "," << e << ")\n";
            }
    };
}

#include <boost/spirit/include/qi.hpp>

void merge_into(std::istream&& is, Graph& into) {
    namespace qi  = boost::spirit::qi;
    namespace phx = boost::phoenix;

    phx::function<edge_merge_f> merge_edges;
    boost::spirit::istream_iterator f(is >> std::noskipws), l;

    bool ok = qi::phrase_parse(f, l, 
            (qi::int_ >> *qi::int_) [ merge_edges(qi::_1, qi::_2, phx::ref(into)) ] % qi::eol,
            qi::blank);

    assert(ok);
}
Community
  • 1
  • 1
sehe
  • 374,641
  • 47
  • 450
  • 633
  • PS. right now I've used /the/ simples grammar I could think of, but of course, that topic can be dealt with elsewhere. (Such as, e.g. [here](http://stackoverflow.com/a/28310552/85371) before) – sehe Mar 20 '15 at 21:33
  • As a bonus, a version that completely decouples the `merge_adjacencies` from the Graph type by passing the `EdgeCallback` to it as a `boost::function`: **[Decoupled version](http://coliru.stacked-crooked.com/a/d1c4e408cdb85680)**. Of course this will be less efficient, but it shows what I mean by keeping things separate. – sehe Mar 20 '15 at 21:53