0

Does anyone know of R functions to solve the assignment problem from combinatoric optimization.

For example: Say I have N=3 students (1,2,3) and S=4 possible job placements (A,B,C,D). Each of the N=3 students ranks all of the S=4 job placements. 1 placement would obviously get no student (missing denoted as "."). This is a reality (since N is less than S) and is fine. There are (I think) 4! possible allocations here:

A B C D
1 2 3 .
1 . 2 3   
1 3 . 2
1 . 3 2
1 3 2 .

And so on for the next 18 possible ways...

So if N and S are small I can consider all possible "realities" of students being allocated to jobs. I can sum the ranks from each "reality". And choose the reality with the minimum sum rank as being (in a fair system) the one where most students get what they want.

Chris
  • 3,401
  • 5
  • 33
  • 42
  • Your intuition that this is not a suitable question for SO is probably correct. You are basically asking for a combination of tool recommendations and general opinions, which typically isn't what we aim for here. I would recommend seeking the advice of an expert in this area, probably someone who deals with combinatorics, or combinatorial optimization. – joran Jan 15 '14 at 22:03

2 Answers2

5

For purposes of an example in R assume matrix m below where each row is a student and each column is a job and 1 means the student's top choice, 2 means the second choice and so on. 9 means the student did not rank the job. There were 3 students and 4 tasks so we added a dummy student, U, of all 9s for the last row so that the number of students and tasks are the same. We assume the objective is to minimize the sum of the ranks. Below we see that the best assignment is to assign student 1 to job C, student 2 to job D, student 3 to job A and the unassigned row slurps up job B.

m <- matrix(c(3, 2, 1, 9, 2, 3, 2, 9, 1, 9, 3, 9, 9, 1, 9, 9), 4,
       dimnames = list(c(1, 2, 3, "U"), c("A", "B", "C", "D")))

#   A B C D
# 1 3 2 1 9
# 2 2 3 9 1
# 3 1 2 3 9
# U 9 9 9 9

library(lpSolve)
fm <- lp.assign(m)

At this point fm$solution contains the solution, a matrix of the same dimensions as m with 0 and 1 entries.

Note: If the solution is a permutation matrix except possibly for some rows that are all zero then this will give the assignment:

student <- rownames(m)
ix <- round(fm$solution %*% seq_len(ncol(m)))
job <- colnames(m)[ifelse(ix == 0, NA, ix)]
data.frame(student, job)

The last line gives the following so in this case each student got their first choice:

   student  job
1        1    C
2        2    D
3        3    A
4        U    B

Note that there could be more than one minimizing solution, e.g. if two students choose the same rankings.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thx for the `lp.assign()` suggestion. My real problem has 188*188 cost matrix. I get output `Success: the objective function is 1171` for my problem. Each of the 188 rowsums and colsums of `fm$solution` table is 1. When I use `job <- colnames(m)[apply(fm$solution == 1, 1, which)]` I get error: `Error in colnames(m)[apply(fm$solution == 1, 1, which)]: invalid subscript type 'list'`. Some elements of are `integer(0)`. Why is `which()` not correctly assigning the column number where to the student for which the specific job is evaluating to 1? – Chris Jan 17 '14 at 02:49
  • I assumed that in `fm$solution` there would be one 1 in each row and in each column and everything else is 0. Seems that is not the case here. Look at `fm$solution` to determine what is going on. – G. Grothendieck Jan 17 '14 at 04:14
  • I looked a bit more closely at my matrix...The output from `lp.assign()` looks like a permutation matrix (perhaps in disguise). However when I logically evaluate whether the `[i,j]` elements that look like 1's and are givining me problems are 1's they evaluate to `FALSE`. Hence why `which()` fails when you look for exact 1's. I got around this by applying `round()` to the `fm$solution()` matrix. So my in disguise 1's get rounded to 1 (exact), and then are found by the `which()` function. Make sense? I hope? I think it's a bug with lp.assign() and the type of solution matrix it creates? – Chris Jan 17 '14 at 16:13
  • So I guess the only edit to your very helpful example is: `job <- colnames(m)[apply(round(fm$solution) == 1, 1, which)]` – Chris Jan 17 '14 at 16:19
  • OK. I have added `round` and also added some code to handle the case that there might be a row of all zeros although I don't know for a fact whether that is possible or not. Based on the documentation it should not be possible. – G. Grothendieck Jan 17 '14 at 16:46
  • I checked all the solutions provided by `1p.assign()` in this instance (i.e. what students got assigned to what rank of job). And in spite of the problem with the 1's not being exactly equal to 1 the optimization is actually still fairly/very good. 70% assigned to first choice. 20% assigned second choice. Etc. Thx so much again for the helpful code. – Chris Jan 17 '14 at 20:23
1

The problem you are trying to solve is a famous one in Operations Research called the Assignment Problem.

The greedy algorithm doesn't work, because it depends strongly on the order in which it assigns the students and can make very bad decisions.

You could solve it trivially using the itertools library from python to loop over all permutations (as in How to generate all permutations of a list in Python) and choose the best one, but it is really, really expensive that way (there are factorial of #jobs possible assignments)... although it's a way to debug your algorithm ;).

As the Wikipedia article says, you can solve it with the Hungarian Algorithm (but I wouldn't try to implement it myself, it is a little tricky), or with a Minimum Cost Flow algorithm.

There's an implementation already in Python, that comes with examples:

https://pypi.python.org/pypi/munkres/

To install it you can simply use pip:

pip install munkres
Community
  • 1
  • 1
Gastón Bengolea
  • 855
  • 6
  • 13