6

I'm fairly new to R and I'm trying to write a script for what I used to do with Solver in Excel. In my data below, I have a list of workers with job types A-E. Each worker has a salary and a production rate. What I want R to do is find the maximum production I can get from 10 workers with a cumulative salary <100,000. The restraints are that I need an exact total of 10 workers and I need 2 from job types A-D, 1 from E, and 1 of any type.

I've searched and searched for a way to do this with optim, IpSolve, etc., but with my limited knowledge I have not had much luck.

Thank you for your help!

Name    Pos Salary  Producton
Joe     A   12001   13.1
Jim     A   17753   23.5
Jill    A   11447   14.8
Brian   A   11447   14.8
Sally   B   2171    1.2
Nancy   B   4537    2.1
Francis B   2840    1.8
Ace     B   2840    1.8
Bill    C   3818    1.6
Ted     C   11447   0.1
Henry   C   2000    1.1
Kyle    C   3818    1.6
Sam     D   11447   0.1
Trevor  D   2000    1.1
John    D   4317    11.7
Jerome  D   2000    1.1
Rebecca E   3818    1.6
Sunny   E   11447   0.1
Britt   E   2000    1.1
Sara    E   4317    11.7
Blue Magister
  • 13,044
  • 5
  • 38
  • 56
  • Yes, a minimum of 2. Thank you! – Derek Pierson Oct 03 '13 at 14:56
  • Just a thought: choose(20,10) = 184756, so it won't take all that long to test every possible combination in this small case. Unless, of course, this is homework and you *must* use a solver. – Carl Witthoft Oct 03 '13 at 15:05
  • Luckily it's not homework, but the full list has over three hundred people in it. My mistake, I should have mentioned that in the original post. – Derek Pierson Oct 03 '13 at 15:14

1 Answers1

6

Use lp in the lpSolve package to solve the underlying integer programming problem. The first 5 constraints are on the number of A, B, C, D and E positions respectively, the 6th is on the number of staff to choose and the 7th is on the total salary. Assuming DF is the data frame shown in the question try this:

library(lpSolve)

obj <- DF$Prod
con <- rbind(t(model.matrix(~ Pos + 0, DF)), rep(1, nrow(DF)), DF$Salary)
dir <- c(">=", ">=", ">=", ">=", ">=", "==", "<")
rhs <- c(2, 2, 2, 2, 1, 10, 100000)

result <- lp("max", obj, con, dir, rhs, all.bin = TRUE)

which gives:

> result
Success: the objective function is 84.7 
> DF[result$solution == 1, ]
     Name Pos Salary Producton
2     Jim   A  17753      23.5
3    Jill   A  11447      14.8
4   Brian   A  11447      14.8
6   Nancy   B   4537       2.1
8     Ace   B   2840       1.8
9    Bill   C   3818       1.6
12   Kyle   C   3818       1.6
14 Trevor   D   2000       1.1
15   John   D   4317      11.7
20   Sara   E   4317      11.7

Note that Production is misspelled in the question or perhaps that was intended.

ADDED:

Regarding the second best solution the idea is to add a constraint which makes the best solution infeasible but does not exclude other potential solutions:

con2 <- rbind(con, result$solution)
dir2 <- c(dir, "<=")
rhs2 <- c(rhs, 9)
result2 <- lp("max", obj, con2, dir2, rhs2, all.bin = TRUE)

In this case we get the following which has the same optimum objective value as the best solution so it would be just as good:

> result2
Success: the objective function is 84.7 
> DF[result2$solution == 1, ]
     Name Pos Salary Producton
2     Jim   A  17753      23.5
3    Jill   A  11447      14.8
4   Brian   A  11447      14.8
6   Nancy   B   4537       2.1
8     Ace   B   2840       1.8
9    Bill   C   3818       1.6
12   Kyle   C   3818       1.6
15   John   D   4317      11.7
16 Jerome   D   2000       1.1
20   Sara   E   4317      11.7

There are also arguments to lp which allow it to produce multiple solutions directly; however, the help file mentions some bugs and it may be safer to take the above approach.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Amazing, thank you so much! As icing on the cake, is there a way to get the 2nd best solution? – Derek Pierson Oct 03 '13 at 15:25
  • Question: I was thinking of an algorithm like: 1) calculate a figure of merit, i.e. Production/Salary, 2)Grab the top FOM candidate from every required category (A-E), 3) fill the remaining slots with the top remaining FOM values. Is that essentially what `lpSolve` is doing? – Carl Witthoft Oct 03 '13 at 15:33
  • @Derek, Have added a second-best solution. @Carl, `lp` uses the branch-and-bound algorithm. – G. Grothendieck Oct 03 '13 at 16:04
  • Fantastic! My final question is can we add code to exclude a specific person, say "Jim"? As if he called in sick and was no longer an option. I know I could delete him from DF, but it would be nice if I could simply change it in the model. Thanks again! – Derek Pierson Oct 03 '13 at 16:29
  • Just add a constraint that sets the Jim variable to 0. That is append this constraint to the constraint matrix `as.numeric(DF$Name == "Jim")`, append this to the direction "==" and append this to the `rhs`: `0` . – G. Grothendieck Oct 03 '13 at 16:36