I have been browsing posts on this site, the library's documentation, and discussions and explanations on other sites. However, I am struggling to understand how the C++ Boost Graph library works in my application. My questions are highlighted in bold. I list them chronologically regarding what part of the function they refer to. The bottom question is the most essential one though.
Consider the following example using C++ 17:
#include <vector>
#include <boost/config.hpp>
#include <boost/graph/graph_traits.hpp>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/dijkstra_shortest_paths.hpp>
#include <boost/property_map/property_map.hpp>
typedef boost::adjacency_list <boost::listS, boost::vecS, boost::directedS, boost::no_property, boost::property<boost::edge_weight_t, double> > graph_t;
typedef boost::graph_traits<graph_t>::vertex_descriptor vertex_descriptor;
void example_paths(std::vector<int>& from, std::vector<int>& to, std::vector<double>& weights, int source, std::vector<int>& targets) {
}
The input vectors from
, to
, and weights
have around 2.2 billion elements each and denote the edges of the graph. from
hosts the origin IDs, to
the destination IDs, and weights
the edge weights. I.e. the first edge goes from vertex from[0]
to vertex to[0]
and has weight weights[0]
. This data creates a graph with around 250 million vertices. Almost all vertices are connected to the main graph. Strictly speaking, it is thus a multi-graph, are however a graph made up of multiple graphs is called in this library. From that graph, I want to compute the shortest paths from the vertex with the ID source
to the vertices with the IDs targets
.
In my code, I follow the documentation's example and define the OutEdgeList
as a boost::listS
and the VertexList
as a boost::vecS
. I read that the types (boost::vecS
, boost::listS
, boost::slistS
, boost::setS
, boost::multisetS
, and boost::hash_setS
) available for OutEdgeList
and VertexList
differ in terms of computational efficiency. As I have to consider the total computational efficiency, taking both the graph construction from the input vectors and the application of Dijkstra's algorithm into account, I do not know which type I should choose. What types am I supposed to use for OutEdgeList
and VertexList
?
That brings me to my second question. I have found two drastically different methods of adjacency list construction:
graph_t g;
for(int i = 0; i < from.size(); i++) {
boost::add_edge (from[i], to[i], weights[i], g);
}
and
typedef std::pair< int, int > Edge;
const int num_nodes = 5;
enum nodes { A, B, C, D, E };
char name[] = "ABCDE";
Edge edge_array[] = { Edge(A, C), Edge(B, B), Edge(B, D), Edge(B, E),
Edge(C, B), Edge(C, D), Edge(D, E), Edge(E, A), Edge(E, B) };
int weights[] = { 1, 2, 1, 2, 7, 3, 1, 1, 1 };
int num_arcs = sizeof(edge_array) / sizeof(Edge);
graph_t g(edge_array, edge_array + num_arcs, weights, num_nodes);
where the second approach is taken from the documentation and not adjusted to my application. Which approach is more efficient? If it is the second one, what would be the translation for my example? And is it a problem that I use numeric instead of character IDs?
In general, I want to implement three variants of this shortest path function. The first variant only obtains the vertex IDs along the shortest paths. I.e. it generates a std::vector<std::vector<int> >
where the first element is a vector from source
to targets[0]
etc. The second variant only returns the distances, i.e. a std::vector<double>
where the first element is the sum of edge weights along the shortest path between source
and targets[0]
etc. The third variant returns the output of both the other variants. From what I understand, calculating the shortest paths works something like this
std::vector<vertex_descriptor> p(num_vertices(g));
std::vector<double> d(num_vertices(g));
vertex_descriptor s = vertex(1, g);
std::set<vertex_descriptor> t = {vertex(5, g), vertex(7, g)};
boost::dijkstra_shortest_paths(g, s, boost::predecessor_map(&p[0]).distance_map(&d[0]).visitor(target_visit(targets, on_examine_vertex())));
where I can add or leave out parts, depending on whether I want to extract the distances or the vertex IDs along the paths. Though, I do not really understand how it works. How do I call boost::dijkstra_shortest_paths
to obtain the std::vector
s in each of the three function variants described above?
I am sorry for this lengthy post. I suppose that these issues are intuitive for someone experienced with this library, but I have been stuck on them for days. I am aware that some people on this platform are sensitive to questions i.a. asking about efficiency, but dealing with billions of edges requires computational efficiency to be a primary consideration.
Please do not hesitate to ask questions in case parts of my post require clarification.
Thanks in advance.