3

I have an application where for a given fixed number of vertices, there is a need to solve large number of different max-flow algorithms from a given fixed source (S) to a given fixed sink (T). Each max-flow problem differs in that the directed arcs themselves change along with their capacities. As an example, see below.

enter image description here

The number of vertices remains fixed, but the actual arcs and their capacities differ from one problem to the next.

I have the following code that solves the max-flow problem iteratively for Graph 1 and Graph 2 in the figure above using boost thus (apologies for the wall of text, I have tried to make it as minimal as possible. The code below fully compiles on g++ on my linux box, but I am unable to have this correcly compile on online compilers such as wandbox, etc.):

#include <boost/config.hpp>
#include <iostream>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/boykov_kolmogorov_max_flow.hpp>

using namespace boost;
typedef adjacency_list_traits<vecS, vecS, directedS> Traits;
typedef adjacency_list<
    vecS, vecS, directedS,
    property<
    vertex_name_t, std::string,
    property<vertex_index_t, int,
    property<vertex_color_t, boost::default_color_type,
    property<vertex_distance_t, double,
    property<vertex_predecessor_t, Traits::edge_descriptor>
    > > > >,
    property<
    edge_index_t, int,
    property<edge_capacity_t, double,
    property<edge_weight_t, double,
    property<edge_residual_capacity_t, double,
    property<edge_reverse_t, Traits::edge_descriptor>
> > > > >
Graph;

Graph g;
property_map<Graph, edge_index_t>::type e;
property_map<Graph, edge_capacity_t>::type cap;
property_map<Graph, edge_weight_t>::type cost;
property_map<Graph, edge_residual_capacity_t>::type rescap;
property_map<Graph, edge_reverse_t>::type rev;
property_map<Graph, vertex_color_t>::type colors;

void initialize(int nnodes) {
    e = get(edge_index, g);
    cap = get(edge_capacity, g);
    cost = get(edge_weight, g);
    rescap = get(edge_residual_capacity, g);
    rev = get(edge_reverse, g);
    colors = get(vertex_color, g);
    for(int i = 0; i < nnodes; i++)
        add_vertex(g);
}

void clearedges() {
    Graph::vertex_iterator v, vend;
    for (boost::tie(v, vend) = vertices(g); v != vend; ++v) 
        boost::clear_out_edges(*v, g);
}

void createedges(std::vector<std::pair<int, int>>& arcs, std::vector<double>& capacity) {
    Traits::edge_descriptor edf, edr;//forward and reverse
    for (int eindex = 0, sz = static_cast<int>(arcs.size()); eindex < sz; eindex++) {
        int fr, to;
        fr = arcs[eindex].first;
        to = arcs[eindex].second;
        edf = add_edge(fr, to, g).first;
        edr = add_edge(to, fr, g).first;
        e[edf] = 2 * eindex;
        e[edr] = e[edf] + 1;
        cap[edf] = capacity[eindex];
        cap[edr] = capacity[eindex];
        rev[edf] = edr;
        rev[edr] = edf;
    }
}

double solve_max_flow(int s, int t) {
    double retval = boykov_kolmogorov_max_flow(g, s, t);
    return retval;
}

bool is_part_of_source(int i) {
    if (colors[i] == boost::black_color)
        return true;
    return false;
}

int main() {
    initialize(6);
    std::vector<std::pair<int, int>> arcs1 = { std::make_pair<int,int>(0,1),
    std::make_pair<int,int>(0,2),
    std::make_pair<int,int>(1,2),
    std::make_pair<int,int>(1,3),
    std::make_pair<int,int>(1,4),
    std::make_pair<int,int>(2,4),
    std::make_pair<int,int>(3,4),
    std::make_pair<int,int>(3,5),
    std::make_pair<int,int>(4,5)
    };
    std::vector<double> capacities1 = { 10, 10, 10, 10, 1, 4, 3, 2, 10 };
    clearedges();
    createedges(arcs1, capacities1);
    double maxflow = solve_max_flow(0, 5);
    printf("max flow is %f\n", maxflow);
    for (int i = 0; i < 6; i++)
        if (is_part_of_source(i))
            printf("Node %d belongs to subset source is in\n", i);
    Graph::edge_iterator e_, eend_;
    int Eindex = 0;
    for (boost::tie(e_, eend_) = edges(g); e_ != eend_; ++e_) {
        int fr = source(*e_, g);
        int to = target(*e_, g);
        printf("(%d) Edge %d: (%d -> %d), capacity %f\n", Eindex, e[*e_], fr, to, cap[*e_]);
        Eindex++;
        if (is_part_of_source(fr) && is_part_of_source(to) == false)
            printf("----is part of ST Cut-----\n");
        else
            printf("x\n");
    }
    std::vector<std::pair<int, int>> arcs2 = { std::make_pair<int,int>(0,1),
    std::make_pair<int,int>(0,2),
    std::make_pair<int,int>(1,3),
    std::make_pair<int,int>(2,4),
    std::make_pair<int,int>(3,5),
    std::make_pair<int,int>(4,5)
    };
    std::vector<double> capacities2 = { 10, 10, 10, 4, 2, 0 };
    clearedges();
    createedges(arcs2, capacities2);
    maxflow = solve_max_flow(0, 5);
    printf("max flow is %f\n", maxflow);
    for (int i = 0; i < 6; i++)
        if (is_part_of_source(i))
            printf("Node %d belongs to subset source is in\n", i);
    Eindex = 0;
    for (boost::tie(e_, eend_) = edges(g); e_ != eend_; ++e_) {
        int fr = source(*e_, g);
        int to = target(*e_, g);
        printf("(%d) Edge %d: (%d -> %d), capacity %f\n", Eindex, e[*e_], fr, to, cap[*e_]);
        Eindex++;
        if (is_part_of_source(fr) && is_part_of_source(to) == false)
            printf("----is part of ST Cut-----\n");
        else
            printf("x\n");
    }
    getchar();
}

I have the following questions.

(a) If the underlying vertices remain fixed, but only the arcs and their capacities change from iteration to iteration, is there anything faster than using clear_out_edges to clear the arcs and then using add_edge to add the new arcs with their new capacities? Also, does clear_out_edges correctly also clear the property map entries that may have the edge descriptor just deleted as key?

(b) Boost max-flow algorithms seem to want the explicit addition of reverse arcs. As of now, in function createedges I explicitly do this via a forward edge descriptor (edf) and a reverse edge descriptor (edr). Is there any performance penalty for this especially when the number of max flow problems that need to be solved is in the 1000s? Is there anything that is more efficient than this?

(c) I am able to correctly enumerate the arcs of the minimal S/T cut via the following portion of the code:

    int Eindex = 0;
    for (boost::tie(e_, eend_) = edges(g); e_ != eend_; ++e_) {
        int fr = source(*e_, g);
        int to = target(*e_, g);
        printf("(%d) Edge %d: (%d -> %d), capacity %f\n", Eindex, e[*e_], fr, to, cap[*e_]);
        Eindex++;
        if (is_part_of_source(fr) && is_part_of_source(to) == false)
            printf("----is part of ST Cut-----\n");
        else
            printf("x\n");
    }

Is there any more efficient way or enumerating the arcs of the S/T cut than the above?

Tryer
  • 3,580
  • 1
  • 26
  • 49
  • There might be a better way to solve thousands of max flows on the same graph, but it's hard to say without knowing how the capacity functions are related. – David Eisenstat May 20 '21 at 12:44
  • @DavidEisenstat The capacities can change arbitrarily. That is unlikely to make the solution of the previous iteration even feasible for the new problem. Do you have any linear programming based warm start in mind? – Tryer May 20 '21 at 13:02
  • 3
    I'm not an expert on flows (even though I did spend part of my PhD thinking very deeply about planar flow), but I believe that it should be possible to derive some feasible flow from the previous flow and then use the residual network of the feasible flow as the starting point for optimization. For example you could do it with [arc demands](https://courses.engr.illinois.edu/cs498dl1/sp2015/notes/25-maxflowext.pdf). – David Eisenstat May 20 '21 at 13:11

1 Answers1

1

There's many issues. If you use modern C++ and compiler warnings, you can reduce the code and spot the bugs in printing vertex descriptors (printf is just not safe; use the diagnostics!).

Here's my take after review.

Notable changes:

  • bundled properties instead of separate interior properties

  • this implies passing named arguments (but see https://stackoverflow.com/a/64744086/85371)

  • no more global variables, no more loopy initialization if the simple constructor suffices

  • no more duplicated code (nothing invites error quite like having capacities1 and capacities2 lying around)

  • using clear_vertex instead of just clear_out_edges - this may not make a difference (?) but seems to express intent a bit better

  • no more printf (I'll use libfmt, which is also in c++23), so e.g.

     fmt::print("Max flow {}\nNodes {} are in source subset\n", maxflow,
                vertices(_g) | filtered(is_source));
    

    prints

     Max flow 10
     Nodes {0, 1, 2, 3} are in source subset
    

    all in one go

  • print what you think you are printing. In particular, use the library support for printing edges if you can

Live On Compiler Explorer

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/boykov_kolmogorov_max_flow.hpp>
#include <boost/range/adaptors.hpp>
#include <fmt/ostream.h>
#include <fmt/ranges.h>
using boost::adaptors::filtered;

using Traits = boost::adjacency_list_traits<boost::vecS, boost::vecS, boost::directedS>;
using V      = Traits::vertex_descriptor;
using E      = Traits::edge_descriptor;

using Capacity = double;
using Color    = boost::default_color_type;

struct VertexProps {
    // std::string name;
    Color    color;
    Capacity distance;
    E        predecessor;
};

struct EdgeProps {
    int      id;
    Capacity weight, residual;
    E        reverse;
};

using Graph = boost::adjacency_list<
    boost::vecS, boost::vecS, boost::directedS,
    VertexProps,
    // see https://stackoverflow.com/a/64744086/85371 :(
    boost::property<boost::edge_capacity_t, Capacity, EdgeProps>>;

struct MyGraph {
    MyGraph(size_t nnodes) : _g(nnodes) {}

    void runSimulation(auto const& arcs, auto const& capacities)
    {
        reconfigure(arcs, capacities);

        Capacity maxflow = solve_max_flow(0, 5);

        auto cap       = get(boost::edge_capacity, _g);
        auto is_source = [this](V v) { return _g[v].color == Color::black_color; };

        fmt::print("Max flow {}\nNodes {} are in source subset\n", maxflow,
                vertices(_g) | filtered(is_source));

        for (E e : boost::make_iterator_range(edges(_g))) {
            bool st_cut =
                is_source(source(e, _g)) and
                not is_source(target(e, _g));

            fmt::print("Edge {} (id #{:2}), capacity {:3} {}\n", e, _g[e].id,
                    cap[e], st_cut ? "(ST Cut)" : "");
        }
    }

private:
    Graph _g;

    void reconfigure(auto const& arcs, auto const& capacities)
    {
        assert(arcs.size() == capacities.size());

        for (auto v : boost::make_iterator_range(vertices(_g))) {
            // boost::clear_out_edges(v, g);
            boost::clear_vertex(v, _g);
        }

        auto cap  = get(boost::edge_capacity, _g);
        auto eidx = get(&EdgeProps::id, _g);
        auto rev  = get(&EdgeProps::reverse, _g);

        auto  eindex = 0;

        for (auto [fr, to] : arcs) {
            auto edf  = add_edge(fr, to, _g).first;
            auto edr  = add_edge(to, fr, _g).first;
            eidx[edf] = 2 * eindex;
            eidx[edr] = eidx[edf] + 1;
            cap[edf]  = cap[edr] = capacities[eindex];

            rev[edf] = edr;
            rev[edr] = edf;

            ++eindex;
        }
    }

    Capacity solve_max_flow(V src, V sink)
    {
        return boykov_kolmogorov_max_flow(
            _g, src, sink,
            // named arguments
            boost::reverse_edge_map(get(&EdgeProps::reverse, _g))
                .residual_capacity_map(get(&EdgeProps::residual, _g))
                .vertex_color_map(get(&VertexProps::color,       _g))
                .predecessor_map(get(&VertexProps::predecessor,  _g))
                .distance_map(get(&VertexProps::distance,        _g))
            // end named arguments
        );
    }
};

int main() {
    MyGraph g{6};

    using namespace std;
    for (auto&& [arcs, capacities] : { tuple
            // 1
            {vector{pair{0, 1}, {0, 2}, {1, 2}, {1, 3}, {1, 4},
                                    {2, 4}, {3, 4}, {3, 5}, {4, 5}},
            vector{10, 10, 10, 10, 1, 4, 3, 2, 10}},
            // 2
            {vector{pair{0, 1}, {0, 2}, {1, 3}, {2, 4}, {3, 5}, {4, 5}},
            vector{10, 10, 10, 4, 2, 0}},
        })
    {
        g.runSimulation(arcs, capacities);
    }
}

Prints

Max flow 10
Nodes {0, 1, 2, 3} are in source subset
Edge (0,1) (id # 0), capacity  10 
Edge (0,2) (id # 2), capacity  10 
Edge (1,0) (id # 1), capacity  10 
Edge (1,2) (id # 4), capacity  10 
Edge (1,3) (id # 6), capacity  10 
Edge (1,4) (id # 8), capacity   1 (ST Cut)
Edge (2,0) (id # 3), capacity  10 
Edge (2,1) (id # 5), capacity  10 
Edge (2,4) (id #10), capacity   4 (ST Cut)
Edge (3,1) (id # 7), capacity  10 
Edge (3,4) (id #12), capacity   3 (ST Cut)
Edge (3,5) (id #14), capacity   2 (ST Cut)
Edge (4,1) (id # 9), capacity   1 
Edge (4,2) (id #11), capacity   4 
Edge (4,3) (id #13), capacity   3 
Edge (4,5) (id #16), capacity  10 
Edge (5,3) (id #15), capacity   2 
Edge (5,4) (id #17), capacity  10 
Max flow 2
Nodes {0, 1, 2, 3, 4} are in source subset
Edge (0,1) (id # 0), capacity  10 
Edge (0,2) (id # 2), capacity  10 
Edge (1,0) (id # 1), capacity  10 
Edge (1,3) (id # 4), capacity  10 
Edge (2,0) (id # 3), capacity  10 
Edge (2,4) (id # 6), capacity   4 
Edge (3,1) (id # 5), capacity  10 
Edge (3,5) (id # 8), capacity   2 (ST Cut)
Edge (4,2) (id # 7), capacity   4 
Edge (4,5) (id #10), capacity   0 (ST Cut)
Edge (5,3) (id # 9), capacity   2 
Edge (5,4) (id #11), capacity   0 

Side Note

If you think main is overcomplicated, here's another way to write it for just the two invocations:

Live On Compiler Explorer

g.runSimulation({{0, 1}, {0, 2}, {1, 2}, {1, 3}, {1, 4}, {2, 4}, {3, 4}, {3, 5}, {4, 5}},
                {10, 10, 10, 10, 1, 4, 3, 2, 10});

g.runSimulation({{0, 1}, {0, 2}, {1, 3}, {2, 4}, {3, 5}, {4, 5}},
                {10, 10, 10, 4, 2, 0});
sehe
  • 374,641
  • 47
  • 450
  • 633
  • 1
    Added a simpler invocation style [Live On Compiler Explorer](https://godbolt.org/z/T87Wh1jeP). Sorry if I never got to addressing the graph-theoretical aspects of the question. In fairness, that part may be more for cs.SE/math.SE. I just got lost in reviewing the actual code for problems/ideas. I hope this gave you something to work with. – sehe May 20 '21 at 14:19
  • every post of yours give me something new. regarding clear_vertex in lieu of clear_out_edges : my worry was that it unnecessarily (behind the scenes) allocates and deletes some memory? Hence my preference for retaining as much common between one problem instance to the next. BTW, could you comment on whether clear_vertex or clear_out_edges correctly removes the map entries in which the just deleted descriptors (vertex or edge) are key? My worry is that if that does not happen, somehow the map keep growing bigger and bigger even for a problem with just 6 vertices if it is solved 1000 times? – Tryer May 20 '21 at 14:21
  • 1
    Yes, this will happen, because "there is no map" - not separately. The properties are in the storage container that owns the vertices/edges. So if the edges disappear, so do all stored properties. [It is possible that some amount of non-shrinking happens, but that is likely to be what you want, since it prevents some reallocations. If you want to control the allocation strategies more then I'd suggest using a custom container selector **or** devising your own graph model (see e.g. https://stackoverflow.com/a/56203763/85371) – sehe May 20 '21 at 14:54
  • Personally, I'd postpone micro-optimizing until you have explored all algorithmic improvements – sehe May 20 '21 at 14:55