27

I have an unweighted, connected graph. I want to find a connected subgraph that definitely includes a certain set of nodes, and as few extras as possible. How could this be accomplished?

Just in case, I'll restate the question using more precise language. Let G(V,E) be an unweighted, undirected, connected graph. Let N be some subset of V. What's the best way to find the smallest connected subgraph G'(V',E') of G(V,E) such that N is a subset of V'?

Approximations are fine.

skaffman
  • 398,947
  • 96
  • 818
  • 769
hyluka
  • 273
  • 3
  • 4

5 Answers5

14

This is exactly the well-known NP-hard Steiner Tree problem. Without more details on what your instances look like, it's hard to give advice on an appropriate algorithm.

Falk Hüffner
  • 4,942
  • 19
  • 25
  • I hadn't heard of Steiner trees, but I had reached the Wikipedia page for k-minimum spanning trees, which I think is the same thing as the MST version of Steiner trees. – hyluka Oct 20 '10 at 15:51
  • I don't think this is exactly the Steiner Tree problem. First, the value being optimized is the number of vertices rather than length of edges, and second, you can't add arbitrary vertices. You are right though that this problem is close and investigating its solutions might give insight into the problem at hand. – Gintautas Miliauskas Oct 21 '10 at 06:42
  • 4
    If you set the edge lengths to 1, the number of edges will be minimized, and since the resulting subgraph must be a tree, which has exactly one edge less than vertices, this is the same as minimizing the number of vertices. I don't understand what you mean by "you can't add arbitrary vertices". – Falk Hüffner Oct 21 '10 at 15:40
  • "Arbitrary vertices" refers to the "Euclidean Steiner tree" problem, which, according to Wikipedia, is the Steiner tree's original form. http://en.wikipedia.org/wiki/Steiner_tree – hyluka Oct 22 '10 at 04:54
  • 2
    I see, the Wikipedia page is indeed a bit confusing there. I was referring to the general Steiner Tree problem, also known as "Steiner Tree in Graphs". – Falk Hüffner Oct 22 '10 at 12:18
  • Hello, I have a quite similar problem and I post a question here: http://stackoverflow.com/questions/38543608/find-the-minimal-subgraph-that-connects-the-root-node-and-several-special-nod. Could you please shed some light on it..? Thank you! – lllllllllllll Jul 23 '16 at 15:51
6

I can't think of an efficient algorithm to find the optimal solution, but assuming that your input graph is dense, the following might work well enough:

  1. Convert your input graph G(V, E) to a weighted graph G'(N, D), where N is the subset of vertices you want to cover and D is distances (path lengths) between corresponding vertices in the original graph. This will "collapse" all vertices you don't need into edges.

  2. Compute the minimum spanning tree for G'.

  3. "Expand" the minimum spanning tree by the following procedure: for every edge d in the minimum spanning tree, take the corresponding path in graph G and add all vertices (including endpoints) on the path to the result set V' and all edges in the path to the result set E'.

This algorithm is easy to trip up to give suboptimal solutions. Example case: equilateral triangle where there are vertices at the corners, in midpoints of sides and in the middle of the triangle, and edges along the sides and from the corners to the middle of the triangle. To cover the corners it's enough to pick the single middle point of the triangle, but this algorithm might choose the sides. Nonetheless, if the graph is dense, it should work OK.

Gintautas Miliauskas
  • 7,744
  • 4
  • 32
  • 34
5

The easiest solutions will be the following:

a) based on mst: - initially, all nodes of V are in V' - build a minimum spanning tree of the graph G(V,E) - call it T.
- loop: for every leaf v in T that is not in N, delete v from V'.
- repeat loop until all leaves in T are in N.

b) another solution is the following - based on shortest paths tree.
- pick any node in N, call it v, let v be a root of a tree T = {v}. - remove v from N.

  • loop: 1) select the shortest path from any node in T and any node in N. the shortest path p: {v, ... , u} where v is in T and u is in N. 2) every node in p is added to V'. 3) every node in p and in N is deleted from N. --- repeat loop until N is empty.

At the beginning of the algorithm: compute all shortest paths in G using any known efficient algorithm.

Personally, I used this algorithm in one of my papers, but it is more suitable for distributed enviroments. Let N be the set of nodes that we need to interconnect. We want to build a minimum connected dominating set of the graph G, and we want to give priority for nodes in N. We give each node u a unique identifier id(u). We let w(u) = 0 if u is in N, otherwise w(1). We create pair (w(u), id(u)) for each node u.

  • each node u builds a multiset relay node. That is, a set M(u) of 1-hop neigbhors such that each 2-hop neighbor is a neighbor to at least one node in M(u). [the minimum M(u), the better is the solution].

  • u is in V' if and only if: u has the smallest pair (w(u), id(u)) among all its neighbors. or u is selected in the M(v), where v is a 1-hop neighbor of u with the smallest (w(u),id(u)).

-- the trick when you execute this algorithm in a centralized manner is to be efficient in computing 2-hop neighbors. The best I could get from O(n^3) is to O(n^2.37) by matrix multiplication.

-- I really wish to know what is the approximation ration of this last solution.

I like this reference for heuristics of steiner tree: The Steiner tree problem, Hwang Frank ; Richards Dana 1955- Winter Pawel 1952

AJed
  • 578
  • 6
  • 11
  • While I like this approach best, both are heuristics, right? A) clearly because the MST might choose random edges when there is a draw that might not be draws in the subgraph, and B) because of the random starting point that could lead a different traversal. But b) seems a more reliable heuristic to me - is it? Is there a way to chose a "good" starting point? Last, is there a way to estimate which heuristic would be worse, given a known graph, if there are cases where a) is better? – fnl Feb 08 '15 at 22:37
2

You could try to do the following:

  1. Creating a minimal vertex-cover for the desired nodes N.

  2. Collapse these, possibly unconnected, sub-graphs into "large" nodes. That is, for each sub-graph, remove it from the graph, and replace it with a new node. Call this set of nodes N'.

  3. Do a minimal vertex-cover of the nodes in N'.

  4. "Unpack" the nodes in N'.

Not sure whether or not it gives you an approximation within some specific bound or so. You could perhaps even trick the algorithm to make some really stupid decisions.

aioobe
  • 413,195
  • 112
  • 811
  • 826
0

As already pointed out, this is the Steiner tree problem in graphs. However, an important detail is that all edges should have weight 1. Because |V'| = |E'| + 1 for any Steiner tree (V',E'), this achieves exactly what you want. For solving it, I would suggest the following Steiner tree solver (to be transparent: I am one of the developers):

https://scipjack.zib.de/

For graphs with a few thousand edges, you will usually get an optimal solution in less than 0.1 seconds.

daniel
  • 71
  • 2