Optimal placement of objects wrt pairwise similarity weights

Question

Ok this is an abstract algorithmic challenge and it will remain abstract since it is a top secret where I am going to use it.

Suppose we have a set of objects O = {o_1, ..., o_N} and a symmetric similarity matrix S where s_ij is the pairwise correlation of objects o_i and o_j.

Assume also that we have an one-dimensional space with discrete positions where objects may be put (like having N boxes in a row or chairs for people).

Having a certain placement, we may measure the cost of moving from the position of one object to that of another object as the number of boxes we need to pass by until we reach our target multiplied with their pairwise object similarity. Moving from a position to the box right after or before that position has zero cost.

Imagine an example where for three objects we have the following similarity matrix:

     1.0  0.5  0.8

 S = 0.5  1.0  0.1

     0.8  0.1  1.0

Then, the best ordering of objects in the tree boxes is obviously:

 [o_3] [o_1] [o_2]

The cost of this ordering is the sum of costs (counting boxes) for moving from one object to all others. So here we have cost only for the distance between o_2 and o_3 equal to 1box * 0.1sim = 0.1, the same as:

[o_3] [o_1] [o_2]

On the other hand:

[o_1] [o_2] [o_3]

would have cost = cost(o_1-->o_3) = 1box * 0.8sim = 0.8.

The target is to determine a placement of the N objects in the available positions in a way that we minimize the above mentioned overall cost for all possible pairs of objects!

An analogue is to imagine that we have a table and chairs side by side in one row only (like the boxes) and you need to put N people to sit on the chairs. Now those ppl have some relations that is -lets say- how probable is one of them to want to speak to another. This is to stand up pass by a number of chairs and speak to the guy there. When the people sit on two successive chairs then they don't need to move in order to talk to each other.

So how can we put those ppl down so that every distance-cost between two ppl are minimized. This means that during the night the overall number of distances walked by the guests are close to minimum.

Greedy search is... ok forget it! I am interested in hearing if there is a standard formulation of such problem for which I could find some literature, and also different searching approaches (e.g. dynamic programming, tabu search, simulated annealing etc from combinatorial optimization field).

Looking forward to hear your ideas.

PS. My question has something in common with this thread Algorithm for ordering a list of Objects, but I think here it is better posed as problem and probably slightly different.

Could you add more explanation or examples? I don't understand why the zero cost cases are zero cost. The cost matrix is all positive, and each result includes one pair that are separated by another box. — Patricia Shanahan, Nov 22 '12 at 00:41
When you move from box 1 to box 2 you don't bypass any boxes. So the cost is 0boxes * similarity = 0. You may consider the similarity as cost in that case because it won't change the problem, but the minimum possible cost of an ordering will not be zero. — argyris, Nov 22 '12 at 00:45
I don't have any problem with zero cost for moving from box 1 to box 2, or box 2 to box 3. I'm confused about the cost of moving from box 1 to box 3 in each case. — Patricia Shanahan, Nov 22 '12 at 01:19
Imagine an analogue: you have a table and chairs in one row only (the boxes) and you need to put N people to sit on chairs. Now those ppl have some relations that is -lets say- how probable is one of them to want to speak to another. So how are we going to put those ppl down so that every distance-cost between two ppl are minimized. — argyris, Nov 22 '12 at 09:40
@Patricia, sorry you were right. The cost in those cases is 0.1 since moving from 1st box to 3rd has some cost. I've edited the question. — argyris, Nov 22 '12 at 13:27

score 1 · Answer 1 · answered Nov 23 '12 at 07:10

That sounds like an instance of the Quadratic Assignment Problem. The speciality is due to the fact that the locations are placed on one line only, but I don't think this will make it easier to solve. The QAP in general is NP hard. Unless I misinterpreted your problem you can't find an optimal algorithm that solves the problem in polynomial time without proving P=NP at the same time.

If the instances are small you can use exact methods such as branch and bound. You can also use tabu search or other metaheuristics if the problem is more difficult. We have an implementation of the QAP and some metaheuristics in HeuristicLab. You can configure the problem in the GUI, just paste the similarity and the distance matrix into the appropriate parameters. Try starting with the robust Taboo Search. It's an older, but still quite well working algorithm. Taillard also has the C code for it on his website if you want to implement it for yourself. Our implementation is based on that code.

There has been a lot of publications done on the QAP. More modern algorithms combine genetic search abilities with local search heuristics (e. g. Genetic Local Search from Stützle IIRC).

As far as I understand by reading some definitions online, this is a close formulation. The 1D or 2D space does not make any difference. Probably when the locations are on a line it is possible to take advantage and avoid some checks because you have a specific ordering. — argyris, Nov 23 '12 at 11:13
In fact this special case of QAP is called "linear arrangement problem" (http://www.opt.math.tu-graz.ac.at/~cela/papers/qap_bericht.pdf) — argyris, Nov 23 '12 at 11:37
I don't think this classifies as a linear problem. You do have the correlations between your objects that you want to minimize over the assigned distances. That makes the problem quadratic. In the linear assignment problem the assignment is not influenced by the relative position of the other objects, only by the absolute position. — Andreas, Nov 23 '12 at 12:06

argyris · Answer 2 · 2012-11-22T17:09:27.103

Let me help the thread (of my own) with a simplistic ordering approach.

1. Order the upper half of the similarity matrix.
2. Start with the pair of objects having the highest similarity weight and place them in the center positions.

3. The next object may be put on the left or the right side of them. So each time you may select the object that when put to left or right has the highest cost to the pre-placed objects. Goto Step 2.

The selection of Step 3 is because if you left this object and place it later this cost will be again the greatest of the remaining, and even more (farther to the pre-placed objects). So the costly placements should be done as earlier as it can be.

This is too simple and of course does not discover a good solution.

Another approach is to

1. start with a complete ordering generated somehow (random or from another algorithm)

2. try to improve it using "swaps" of object pairs.

I believe local minima would be a huge deterrent.

That's basically [First Fit Decreasing](http://docs.jboss.org/drools/release/5.5.0.Final/drools-planner-docs/html_single/index.html#d0e6126), right? — Geoffrey De Smet, Nov 22 '12 at 15:22
@Geoffrey De Smet. Hm, I think it resembles but has differences also. For example in FFD you "expect" that a costly move is going to be more costly or inevitable later. Here, you are certain that as later a choice is made the more cost will have. Except if an object o has no similarity with the already placed objects and those that are going to be placed until you put o. — argyris, Nov 22 '12 at 16:23

score 0 · Answer 3 · answered Nov 22 '12 at 17:08

0

Here's a variation of the already posted method. I don't think this one is optimal, but it may be a start.

Create a list of all the pairs in descending cost order.
While list not empty:
  Pop the head item from the list.
  If neither element is in an existing group, create a new group containing
     the pair.
  If one element is in an existing group, add the other element to whichever
     end puts it closer to the group member.
  If both elements are in existing groups, combine them so as to minimize
     the distance between the pair.

Group combining may require reversal of order in a group, and the data structure should be designed to support that.

answered Nov 22 '12 at 17:08

Patricia Shanahan

25,849
4
38
75

Thanks for the answer. However lets see some points. a) In the second "if" statement you may decide to put an object in a group by looking only one pairwise similarity (cost). But let say the object that has not yet been assigned probably have higher sum of similarities with other objects rather those in the assigned group. b) the combination of two groups is again too naive and looks only to minimize the cost of one pair. – argyris Nov 22 '12 at 17:18
Think in the scale of hundreds or thousands of objects to see what deficiencies would have any simplistic algorithm such the one I mentioned or this one. – argyris Nov 22 '12 at 17:19
Now I'm not sure about the question again. I was under the impression that the objective was to minimize the maximum cost of an exchange. The comment about "sum of similarities" suggests that we should be minimizing the sum of costs of exchanges. – Patricia Shanahan Nov 22 '12 at 17:45
I've inserted the "analogue" with the chairs inside the question's text. I hope that explains that we need to minimize the overall cost which is the sum of the pairwise costs. – argyris Nov 22 '12 at 19:57
OK - I'll continue thinking about it on the basis that we are minimizing the sum of the pairwise exchange costs. – Patricia Shanahan Nov 23 '12 at 16:59

Optimal placement of objects wrt pairwise similarity weights

3 Answers3