
Let's define the variance of a graph as the variance of its edge weights. The task I am trying to solve is to design an algorithm that, given a graph G, constructs a spanning tree T of minimum variance.

I am having a hard time getting the ball rolling on this. So far I've noticed that if the average edge weight in such a spanning tree were known, the problem could be solved by replacing the weight of every edge with the square of its deviation from that average and feeding the graph into any MST-finding algorithm.
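For a fixed guess g of the average, this is the reduction I have in mind (a rough sketch in Python; the representation and the name `tree_for_guess` are my own):

```python
def tree_for_guess(n, edges, g):
    """Kruskal's algorithm run on the modified weights (w - g)^2,
    returning the chosen edges with their ORIGINAL weights.
    edges: list of (u, v, w) on vertices 0..n-1."""
    parent = list(range(n))                # union-find forest
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for u, v, w in sorted(edges, key=lambda e: (e[2] - g) ** 2):
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge joins two components
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

So if g really were the average weight of the optimal tree, one run of this would solve the problem.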

Obviously, I do not know anything about the average edge weight in the tree I am trying to construct, but it occurred to me that the average must fall between the weights of two edges of G, and perhaps this information could be exploited.

I'm trying to reduce the problem to finding the MST of G with modified edge weights. I thought about running an MST algorithm once for each edge of G, each time assuming that this edge's weight is the closest to the average in T: the chosen edge gets weight 0, and every other edge gets the square of its deviation from the chosen edge's weight. This approach went nowhere, because if the average isn't equal to the weight of any edge, then depending on where it falls between the weights of the two closest edges, the ordering of edges by modified weight differs and produces different spanning trees when fed into an MST-finding algorithm. This is something I don't know how to handle (or whether it even can be handled).

This is homework, so I'd prefer a small hint to send me in the right direction rather than a solution.

A.F.K.

1 Answer


Idea 1: You only need to be able to compare edges pairwise when you build a minimum weight spanning tree.

Idea 2:

The pairwise comparison of an edge of weight x and an edge of weight y, when each weight is replaced by its squared deviation from the guess g, only changes at g = (x+y)/2.

Idea 3:

If there are |E| edges, there are at most (|E| choose 2)+1 essentially different guesses g for the average weight. Compute a minimum weight spanning tree for each of these. Compare the variances of these trees.
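Putting the three ideas together, the whole procedure might look roughly like this (a sketch only, not a definitive implementation; I pick one guess inside each interval between consecutive midpoints, which is equivalent since only the edge ordering matters to Kruskal):

```python
from itertools import combinations

def min_variance_spanning_tree(n, edges):
    """edges: list of (u, v, w) on vertices 0..n-1.
    Tries one guess g per essentially different edge ordering
    (Ideas 1-3) and returns the tree of minimum weight variance."""
    weights = sorted({w for _, _, w in edges})
    mids = sorted({(x + y) / 2 for x, y in combinations(weights, 2)})
    if mids:
        # one guess strictly inside each interval, plus one on each side
        guesses = [mids[0] - 1, mids[-1] + 1]
        guesses += [(a + b) / 2 for a, b in zip(mids, mids[1:])]
    else:
        guesses = [weights[0]]            # all edges share one weight

    def kruskal(g):
        # ordinary Kruskal on the modified weights (w - g)^2
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        tree = []
        for u, v, w in sorted(edges, key=lambda e: (e[2] - g) ** 2):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v, w))
        return tree

    def variance(tree):
        ws = [w for _, _, w in tree]
        mean = sum(ws) / len(ws)
        return sum((w - mean) ** 2 for w in ws) / len(ws)

    return min((kruskal(g) for g in guesses), key=variance)
```

With O(|E|^2) guesses and one MST computation per guess, this is polynomial overall.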

There should be much faster ways, but this is a polynomial time algorithm.

Douglas Zare
  • Hmmmmmm, you're saying that there are O(n^2) midpoints between all pairs of edge weights, and that for any two points b and c with no such midpoint between them, the new weights calculated around b and around c as guesses for the average would produce the same spanning tree when fed to an MST-finding algorithm. If I got it right then I'll post a solution by Wednesday – I'm a bit busy tomorrow and on Monday. Thanks for the hint :) – A.F.K. Mar 29 '15 at 01:21
  • 1
    The idea ran into another wall... Suppose that we test one of our O(m^2) guesses for the average weight: we assign new weight to edges, equal to the square of the deviation of the weight from the average weight and we feed the graph into and MST finging algo. We have no guarantee that the resulting tree will have an average weight equal to our guess, because the algorithm can (in an extreme case) always pick edges which lie on the same side of our guess, so that the resulting tree will have a greater average... – A.F.K. Apr 01 '15 at 17:03
  • @A.F.K.: Good point. Can you construct an example where this algorithm fails, or is it just something else that has to be proved? – Douglas Zare Apr 02 '15 at 01:42
  • I need to prove the following lemma: if the average in the optimal tree Topt were known, then after replacing the weights of all edges of the input graph G with the square of their deviation from that mean, an MST-finding algorithm (Kruskal seems the most appropriate here) would select the same set of edges as Topt. I've actually presented my solution today, and class ended before I could present my proof of this lemma. Which was nice, because my proof by induction was so weak that it didn't even convince me ;). Next week I'll have to prove it, though. But I want to get there on my own. – A.F.K. Apr 09 '15 at 19:33
  • I will post a solution by next friday, maybe sooner. Will I be able to answer my own question if I choose your answer as the most helpful? – A.F.K. Apr 09 '15 at 19:33
  • Yes, it is a good idea to post your own solution, and if your answer (or someone else's) is more complete than mine, you should accept that instead. – Douglas Zare Apr 09 '15 at 19:45
  • I give up. I don't know how to prove the correctness of this algorithm. The correctness has been confirmed by my professor, who even has an improved version that constructs the first tree using Kruskal's algorithm and each next one in O(1) time by exploiting the fact that when our guess for the average moves past a single midpoint between edges, only 2 neighboring edges (in terms of deviation from the guess) swap places in the ordering... Anyway, I've twisted my brain into a knot trying to prove the lemma above and all I came up with is some silliness that doesn't even convince me. Please help – A.F.K. Apr 15 '15 at 19:44
  • It seems I gave up too soon :D – A.F.K. Apr 15 '15 at 23:27
  • 1
    I can prove the correctness of the algorithm using a small additional lemma: for any set of real numbers A={a_1, a_2, ..., a_m} the sum of the squares of deviations of these numbers from a point r is the smallest when r is equal to the arithmetic average of A. I'm far too tired to do this now, but it doesn't seem very hard, so I'll do it before class tomorrow :D – A.F.K. Apr 15 '15 at 23:34
  • Good choice of lemma. Yes, that one is not hard, and I think I see how you are using it. It's something you might run across in least squares optimization. – Douglas Zare Apr 15 '15 at 23:54
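For completeness, the lemma from the last comments has a one-line calculus proof. Writing the sum of squared deviations as a function of r:

```latex
f(r) = \sum_{i=1}^{m}(a_i - r)^2,
\qquad
f'(r) = -2\sum_{i=1}^{m}(a_i - r) = 0
\;\Longleftrightarrow\;
r = \frac{1}{m}\sum_{i=1}^{m} a_i,
\qquad
f''(r) = 2m > 0,
```

so f is minimized exactly at the arithmetic mean of A.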