
I'm trying to store the nested XML file:

<?xml version="1.0" encoding="utf-8"?>
<root>
<type>
<cars>
<car name="Garfield" weight="4Kg">
<spec serial_="e_54" source ="petrol" mileage="56"/>
<spec serial_="e_52" source="diesel" mileage="52"/>
<spec serial_="m_22" source="electric" mileage="51"/>
<additions source="steam" convert="153 0 1 0"/>
</car>
<car name="Awesome" weight="3Kg">
<spec serial_="t_54" source="petrol" mileage="16"/>
<spec serial_="t_52" source="wind" mileage="62"/>
<spec serial_="t_22" source="electric" mileage="81"/>
<additions source="water" convert="123 1 1 0"/>
</car>
</cars>
<planes>
<plane id="231" name ="abc">
<utilities serial_="e456" part="567"/>
</plane>
</type>
</root>

in the form of a Boost library undirected graph. As per the XML, I wish to make each "car" and "plane" a node, storing their attributes as the node's data members. Next, their child nodes, i.e. "spec", "additions" and "utilities", are to be stored as edges along with their attributes.

The code structure is as follows:

#include <boost/property_tree/xml_parser.hpp>
using boost::property_tree::ptree;
#include <fstream>   // std::ifstream
#include <iostream>
#include <string>
#include <vector>

struct Car {
std::string name, weight;
struct Spec {
    std::string serial_, source;
    double mileage;
};
std::vector<Spec> specs;
};
// Note: structs similar to Car have been devised to store planes and additions

static bool parse(ptree const& node, Car::Spec& into) {
into.serial_ = node.get<std::string>("<xmlattr>.serial_");
into.source  = node.get<std::string>("<xmlattr>.source");
into.mileage = node.get<double>("<xmlattr>.mileage");
return true;
}

static bool parse(ptree const& node, Car& into)  {
into.name   = node.get<std::string>("<xmlattr>.name");
into.weight = node.get<std::string>("<xmlattr>.weight");
for (auto& [name, child] : node) {
    if (name == "spec") {
        into.specs.emplace_back();
        if (!parse(child, into.specs.back())) {
            return false;
        }
    }
}
return true;
}

static std::ostream& operator<<(std::ostream& os, Car const& car) {
os << "Name: " << car.name << ", Weight: " << car.weight;
for (auto& spec : car.specs) {
    os << "\n -- [" << spec.serial_ << "; " << spec.source << "; "
       << spec.mileage << "]";
}
return os;
}

int main() 
{
boost::property_tree::ptree pt;
{
    std::ifstream ifs("input.xml");
    read_xml(ifs, pt);
}

for (auto& [key, node] : pt.get_child("root.type.cars")) {
    if ("car" == key) {
        Car car;
        parse(node, car);
        std::cout << car << "\n";
    }
}
}

I wish to get rid of the structs and have a class in-place instead for storage and form the depicted BGL graph.

All help is duly appreciated. Thanks!!

  • "have a class in-place" - what would that look like? I'd say you have that. Literally **nothing** in the code nor the XML refers to graphs, let alone BGL. Can you tell us what you try to achieve instead of firing ["homework style" questions](https://stackoverflow.com/questions/66302889/parsing-a-nested-xml-file-for-its-attributes-using-boost-library-in-c) only to tell us later that that wasn't what you needed. – sehe Feb 22 '21 at 11:50
  • In case this helps: https://en.cppreference.com/book/intro/classes#:~:text=Basically%20a%20class%20is%20the,lies%20in%20the%20usage%2Dconventions. – sehe Feb 22 '21 at 11:52
  • A class is required to store cars and planes in form of nodes. I'm trying to store them in form of a BGL graph as I need to implement some algorithms (such as shortest path etc.). Furthermore, some additional data is also present in the form of graph, so I cant have this one in form of a property_tree. – user12061536 Feb 22 '21 at 12:02
  • You want to know the shortest path from one car to a plane? – sehe Feb 22 '21 at 12:55
  • "so I cant have this one in form of a property_tree." *YOU* were the one starting about that. *YOU* never posted any code of your own (this code isn't your own). Start out by showing what graph you have. It's **trivial** to parse it into node properties instead, but I can't begin to show you because there's too little info. I'm not going to because I learned from giving my other answer – sehe Feb 22 '21 at 12:56
  • I'm a bit confused why I'm being berated for "manners" here. You didn't upvote or accept my earlier answer, and didn't acknowledge it (or link for context) in this follow-up question. I don't think it's bad manners to point out that asking partially formed questions is proving useless because of moving targets. I never said that answering is "below me". In fact, asking the wrong questions is "below you" if you ask me. – sehe Feb 22 '21 at 17:00
  • My comments (like the [other person's](https://stackoverflow.com/questions/66302889/parsing-a-nested-xml-file-for-its-attributes-using-boost-library-in-c#comment117218319_66302889), by the way) are aimed at helping you realize the problem with the question so that you can save yourself time and trouble. Regardless, I've answered the question below. – sehe Feb 22 '21 at 17:00

1 Answer

Okay, because there's not a lot of other traffic, let's do this.

First, let's fix the input so that it is actually XML:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <type>
    <cars>
      <car name="Garfield" weight="4Kg">
        <spec serial_="e_54" source ="petrol" mileage="56"/>
        <spec serial_="e_52" source="diesel" mileage="52"/>
        <spec serial_="m_22" source="electric" mileage="51"/>
        <additions source="steam" convert="153 0 1 0"/>
      </car>
      <car name="Awesome" weight="3Kg">
        <spec serial_="t_54" source="petrol" mileage="16"/>
        <spec serial_="t_52" source="wind" mileage="62"/>
        <spec serial_="t_22" source="electric" mileage="81"/>
        <additions source="water" convert="123 1 1 0"/>
      </car>
    </cars>
    <planes>
      <plane id="231" name ="abc">
        <utilities serial_="e456" part="567"/>
      </plane>
    </planes>
  </type>
</root>

Now, let's add parsing for the planes:

using Id = std::uint32_t; // same alias as in the full listing

struct Plane {
    Id id;
    std::string name;
    struct Utilities {
        std::string serial_, part;
    };
    Utilities utilities;
};

static bool parse(ptree const& node, Plane::Utilities& into) {
    into.serial_ = node.get<std::string>("<xmlattr>.serial_");
    into.part    = node.get<std::string>("<xmlattr>.part");
    return true;
}

static bool parse(ptree const& node, Plane& into)  {
    into.id   = node.get<Id>("<xmlattr>.id");
    into.name = node.get<std::string>("<xmlattr>.name");
    if (auto child = node.get_child_optional("utilities")) {
        return parse(*child, into.utilities);
    }
    return true;
}

So far, nothing new. Well, we might add the additions to cars:

// nested inside struct Car, next to Spec:
struct Additions {
    std::string source;
    std::vector<double> convert;
};
Additions additions;

Which you can parse using something like the following (x3 is Boost.Spirit X3):

static bool parse(ptree const& node, Car::Additions& into) {
    into.source = node.get<std::string>("<xmlattr>.source");
    auto values = node.get<std::string>("<xmlattr>.convert");

    if (!x3::parse(
            values.begin(), values.end(),
            x3::skip(x3::space) [*x3::double_],
            into.convert))
        return false;
    return true;
}
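
If you'd rather not pull in Spirit just for this, the same whitespace-separated list can be read with the standard library. This is only a sketch of an alternative, not what the demo below uses. The name `parse_doubles` is made up here, and I've chosen to reject trailing garbage, which is stricter than the X3 call above:

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse whitespace-separated doubles such as "153 0 1 0".
// Returns std::nullopt if anything other than numbers and whitespace is found.
std::optional<std::vector<double>> parse_doubles(std::string const& text) {
    std::istringstream in(text);
    std::vector<double> out;
    double d;
    while (in >> d)
        out.push_back(d);
    // The loop ends either because the input is exhausted (eof set)
    // or because a token could not be read as a double (eof not set).
    if (!in.eof())
        return std::nullopt;
    return out;
}
```

Inside `parse(ptree const&, Car::Additions&)` you could then assign the result to `into.convert` and return `false` when the optional is empty.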

Making It A Graph

Instead of "magically" not having the structs but still having the data (how?) you would probably want to attach the structs to your graph:

using VertexBundle = boost::variant<Car, Plane>;
using EdgeBundle = std::string;

using Graph = boost::adjacency_list<
    boost::vecS, boost::vecS, boost::directedS,
    VertexBundle, EdgeBundle>;

Vertices

There, now let's parse those vertices from the XML:

Graph g;

auto parse_vehicles = [&pt,&g]<typename Type>(auto path, auto key) {
    for (auto& [k, node] : pt.get_child(path)) {
        if (k == key) {
            Type vehicle;
            parse(node, vehicle);

            add_vertex(vehicle, g);
        }
    }
};

parse_vehicles.operator()<Car>("root.type.cars", "car");
parse_vehicles.operator()<Plane>("root.type.planes", "plane");

Note how nice and generic that parse loop already was.

Edges

There's nothing in your question indicating how we get any edge information, so let's just make something up for demo purposes:

// TODO add edges, but there's no information on how to
add_edge(vertex(0, g), vertex(2, g), "This demo edge has no properties", g);
add_edge(vertex(2, g), vertex(1, g), "One more", g);

Now you can print the whole thing as before:

for (Vertex v : boost::make_iterator_range(vertices(g))) {
    std::cout << g[v] << "\n";
}

Printing (Live On Coliru):

Name: Garfield, Weight: 4Kg
 -- [e_54; petrol; 56]
 -- [e_52; diesel; 52]
 -- [m_22; electric; 51]
 -- additions [steam; 153/0/1/0]
Name: Awesome, Weight: 3Kg
 -- [t_54; petrol; 16]
 -- [t_52; wind; 62]
 -- [t_22; electric; 81]
 -- additions [water; 123/1/1/0]
Id: 231, Name: abc
 -- utilities [e456; 567]

As a bonus let's include a DOT graph output:

[image: rendering of the DOT graph output]

Full Live Demo

Live On Coliru

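#include <boost/property_tree/xml_parser.hpp>
#include <boost/variant.hpp>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graphviz.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/algorithm/string/replace.hpp> // boost::algorithm::replace_all
#include <boost/lexical_cast.hpp>
#include <boost/property_map/transform_value_property_map.hpp>
namespace x3 = boost::spirit::x3;
using boost::property_tree::ptree;
#include <cstdint>  // std::uint32_t
#include <fstream>  // std::ifstream, std::ofstream
#include <iostream>
#include <map>
#include <string>
#include <utility>  // std::exchange
#include <vector>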

using Id = std::uint32_t;

struct Car {
    std::string name, weight;
    struct Spec {
        std::string serial_, source;
        double mileage;
    };
    struct Additions {
        std::string source;
        std::vector<double> convert;
    };
    std::vector<Spec> specs;
    Additions additions;
};

struct Plane {
    Id id;
    std::string name;
    struct Utilities {
        std::string serial_, part;
    };
    Utilities utilities;
};

static bool parse(ptree const& node, Car::Additions& into) {
    into.source = node.get<std::string>("<xmlattr>.source");
    auto values = node.get<std::string>("<xmlattr>.convert");

    if (!x3::parse(
            values.begin(), values.end(),
            x3::skip(x3::space) [*x3::double_],
            into.convert))
        return false;
    return true;
}

static bool parse(ptree const& node, Car::Spec& into) {
    into.serial_ = node.get<std::string>("<xmlattr>.serial_");
    into.source  = node.get<std::string>("<xmlattr>.source");
    into.mileage = node.get<double>("<xmlattr>.mileage");
    return true;
}

static bool parse(ptree const& node, Car& into)  {
    into.name   = node.get<std::string>("<xmlattr>.name");
    into.weight = node.get<std::string>("<xmlattr>.weight");
    for (auto& [name, child] : node) {
        if (name == "spec") {
            into.specs.emplace_back();
            if (!parse(child, into.specs.back())) {
                return false;
            }
        }
    }
    if (auto child = node.get_child_optional("additions")) {
        return parse(*child, into.additions);
    }
    return true;
}

static bool parse(ptree const& node, Plane::Utilities& into) {
    into.serial_ = node.get<std::string>("<xmlattr>.serial_");
    into.part    = node.get<std::string>("<xmlattr>.part");
    return true;
}

static bool parse(ptree const& node, Plane& into)  {
    into.id   = node.get<Id>("<xmlattr>.id");
    into.name = node.get<std::string>("<xmlattr>.name");
    if (auto child = node.get_child_optional("utilities")) {
        return parse(*child, into.utilities);
    }
    return true;
}

static std::ostream& operator<<(std::ostream& os, Car const& car) {
    os << "Name: " << car.name << ", Weight: " << car.weight;
    for (auto& spec : car.specs) {
        os << "\n -- [" << spec.serial_ << "; " << spec.source << "; "
           << spec.mileage << "]";
    }
    auto& a = car.additions;
    if (!(a.source.empty() && a.convert.empty())) {
        os << "\n -- additions [" << a.source << ";";
        auto sep = ' ';
        for (auto d : a.convert) {
            os << std::exchange(sep, '/') << d;
        }
        os << "]";
    }
    return os;
}

static std::ostream& operator<<(std::ostream& os, Plane const& plane) {
    os << "Id: " << plane.id << ", Name: " << plane.name;
    auto& u = plane.utilities;
    if (!(u.serial_.empty() && u.part.empty())) {
        os << "\n -- utilities [" << u.serial_ << "; " << u.part << "]";
    }
    return os;
}

using VertexBundle = boost::variant<Car, Plane>;
using EdgeBundle = std::string;
using Graph = boost::adjacency_list<
    boost::vecS, boost::vecS, boost::directedS,
    VertexBundle, EdgeBundle>;

using Vertex = Graph::vertex_descriptor;
using Edge   = Graph::edge_descriptor;

int main() 
{
    boost::property_tree::ptree pt;
    {
        std::ifstream ifs("input.xml");
        read_xml(ifs, pt);
    }

    Graph g;

    auto parse_vehicles = [&pt,&g]<typename Type>(auto path, auto key) {
        for (auto& [k, node] : pt.get_child(path)) {
            if (k == key) {
                Type vehicle;
                parse(node, vehicle);

                add_vertex(vehicle, g);
            }
        }
    };

    parse_vehicles.operator()<Car>("root.type.cars", "car");
    parse_vehicles.operator()<Plane>("root.type.planes", "plane");

    // TODO add edges, but there's no information on how to
    add_edge(vertex(0, g), vertex(2, g), "This demo edge has no properties", g);
    add_edge(vertex(2, g), vertex(1, g), "One more", g);

    for (Vertex v : boost::make_iterator_range(vertices(g))) {
        std::cout << g[v] << "\n";
    }

    {
        auto vindex     = get(boost::vertex_index, g);
        auto calc_color = [&](Vertex v) { return g[v].which()? "red":"blue"; };
        auto calc_label = [&](Vertex v) {
            // multiline Mrecord label formatting
            auto txt = boost::lexical_cast<std::string>(g[v]);
            boost::algorithm::replace_all(txt, "\n --", "|");
            return "{" + txt + "}";
        };

        boost::dynamic_properties dp;
        dp.property("node_id",   vindex);
        dp.property("label",     boost::make_transform_value_property_map(calc_label, vindex));
        dp.property("fontcolor", boost::make_transform_value_property_map(calc_color, vindex));
        dp.property("style",     boost::make_static_property_map<Vertex>(std::string("filled")));
        dp.property("label",     get(boost::edge_bundle, g));

        auto pw = boost::dynamic_vertex_properties_writer { dp, "node_id" };
        using Map = std::map<std::string, std::string>;
        auto gpw = boost::make_graph_attributes_writer(Map{}, Map {{"shape", "Mrecord"}}, Map{});

        std::ofstream ofs("graph.dot");
        write_graphviz(ofs, g, pw, pw, gpw);
    }
}

Prints the output shown above, as well as the following graph.dot:

digraph G {
node [
shape=Mrecord];
0 [fontcolor=blue, label="{Name: Garfield, Weight: 4Kg| [e_54; petrol; 56]| [e_52; diesel; 52]| [m_22; electric; 51]| additions [steam; 153/0/1/0]}", style=filled];
1 [fontcolor=blue, label="{Name: Awesome, Weight: 3Kg| [t_54; petrol; 16]| [t_52; wind; 62]| [t_22; electric; 81]| additions [water; 123/1/1/0]}", style=filled];
2 [fontcolor=red, label="{Id: 231, Name: abc| utilities [e456; 567]}", style=filled];
0->2  [label="This demo edge has no properties"];
2->1  [label="One more"];
}
sehe
  • Online DOT graph rendering: https://cutt.ly/BljyXKf – sehe Feb 22 '21 at 17:04
  • Hi @sehe , thanks again for all the help and feedback. Took me a while to understand your code completely. Really well written and definitely elegant. – user12061536 Feb 25 '21 at 19:56
  • I'm still trying to figure out this part of the question: "Next, their child nodes i.e. "spec", "additions" and "utilities" are to be stored in form of edges along with their attributes". I know they will be added in the form of edge properties. I've added the struct in EdgeBundle as well for the same. Could you highlight how I can add them as edges if an edge needs to be added between vertices whose "source" attributes are the same? I tried iterating via the vertex list but I'm unable to access a vertex's individual members via the . or -> method as shown in the printing function. – user12061536 Feb 25 '21 at 20:03
  • I have no idea how I could create edges based on the XML. If you can give me any example edge from example XML I might understand. – sehe Feb 25 '21 at 20:51
  • For this xml, if spec.source = something, than edge needs to be added from between with source as indicated in spec and dest. as indicated in car – user12061536 Feb 25 '21 at 21:04
  • When I try to access, say, g[v].name (i.e. the "name" member of struct Car) with a Boost iterator, the error says no member named 'name'. Can't figure out why. – user12061536 Feb 25 '21 at 21:10
  • Please. _"needs to be added from between with source as indicated in spec"_? I don't understand any of this. What is an example edge? I understand that there can be a language barrier at play. Please just state *examples*. Then, maybe add notes about them. An example would be "The spec "e_77" is an edge between car "CA 1" and plane "XYZ", with the properties .... from the .... attributes" – sehe Feb 25 '21 at 21:15
  • Sorry again for the inconvenience. For above xml, spec.source = "AB 8" and car.name = "AB 5" with latter as destination node and former as source node. Similarly, spec.source = "AB 1" and car.name = "CA 1" with latter as destination node and former as source node. – user12061536 Feb 25 '21 at 21:23
  • So, the match is either on "AB" == "AB" (disregarding the different number) OR on 1 == 1 (disregarding the different letter code)? It feels like "source" is not actually 1 field, but a list of "something" (ids?) that would actually need to be separately stored. (Similar problems exist with "4Kg" and "CA 4l" (is that a typo for "CA 41"?). – sehe Feb 25 '21 at 21:50
  • So now we know a (very contrived) recipe to detect _implicit_ edges. But what properties are part of the edges? Or is each "spec" element _explicitly_ an edge (only) and those properties all belonged with the edge? Is `serial_` like an "edge ID" in the XML? What's the semantic difference between "e_XX" and "t_XX" serials? – sehe Feb 25 '21 at 21:53
  • Yes, each spec element is an edge. It can be considered as edge_properties in context of bgl. In addition, one needs to allow parallel edges as follow: using allow_parallel_edge_tag = boost::graph_traits::edge_parallel_category; as multiple specs exist. – user12061536 Feb 25 '21 at 22:00
  • I changed VertexBundle with struct Car directly in the graph type declaration and the error has been removed. Will try the same for the edges. – user12061536 Feb 25 '21 at 22:28
  • "using allow_parallel_edge_tag = boost::graph_traits::edge_parallel_category" has nothing to do with it. Also, parallel edges don't mean "multiple edges exist". It means "multiple edges between the same vertices exist, with the same direction" – sehe Feb 25 '21 at 22:46
  • "I changed VertexBundle with struct Car" - Ah, you didn't know how variants work. See https://www.boost.org/doc/libs/1_75_0/doc/html/variant.html#variant.intro or https://en.cppreference.com/w/cpp/utility/variant. You need the variant since some of your nodes aren't cars. – sehe Feb 25 '21 at 22:48
  • You didn't answer any of my questions about matching the sources to partial names. I just got an idea, did you mean "AB 5" is connected to "AB 8" (even though no such vehicle exists in your sample XML?). Then I think I'm starting to get it. Good examples are important.... – sehe Feb 25 '21 at 22:54
  • So, here's that idea creating specs as edges after all nodes (keeping an index `by_name`): **[Live](http://coliru.stacked-crooked.com/a/2d766af2a79a20d2)**. Image: https://i.imgur.com/7a3gWOl.png. (note the additions to the [input.xml](http://coliru.stacked-crooked.com/a/66d89958da218567)) – sehe Feb 25 '21 at 23:59
  • Variant handling can be made more elegant using [visitors](https://www.boost.org/doc/libs/1_75_0/doc/html/variant/tutorial.html#boost-common-heading-doc-spacer:~:text=For%20this%20reason%2C%20variant%20supports%20compile%2Dtime%20checked%20visitation%20via%20apply_visitor): http://coliru.stacked-crooked.com/a/b34bcac6e55933e6 – sehe Feb 26 '21 at 00:15
  • Yes, the image depicts the idea behind edges correctly. – user12061536 Feb 26 '21 at 05:40