How to choose elements that sum up to n?

Question

Say I have the following dataframe

x <- c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y <- c(1,4,3,5,5,7,2,2,6,8)
df <- data.frame(x,y)

say x represents players and y represents goals. I want all the subsets of players whose sum of goals is 10, say

{p1,p3,p9},{p3,p6},{p7,p8,p9}...

Not sure if this helps, but it seems related: http://stackoverflow.com/q/31572497/1191259 You probably want to impose a team size to keep things feasible (fine for only 10 players, but if you had another 20 players with 0 or 1 goals, it'll get ugly). ... And another: http://stackoverflow.com/q/32855755/1191259 — Frank, Oct 25 '15 at 12:48
Incidentally, your `cbind` call is completely redundant. Leave it out. — Konrad Rudolph, Oct 25 '15 at 14:25

G. Grothendieck · Accepted Answer · 2015-10-25T19:33:49.197

1) lpSolve This can be done using integert linear programming. We use an objective of c(0, ..., 0) and a one row matrix consisting of y as the constraint matrix. The right hand side of the constraint must equal n, i.e. 10.

library(lpSolve)
y <- c(1,4,3,5,5,7,2,2,6,8)
n <- length(y)
k <- sum(cumsum(sort(y)) <= n) + 1 # upper bound to no of players in group
out <- lp(objective = numeric(n), 
   const.mat = matrix(y, 1), const.dir = "==", const.rhs = n,
   all.bin = TRUE, num.bin.solns = sum(choose(n, 1:k)))

# solution vector seems to have junk at end so truncate it and reshape to matrix
soln <- matrix(head(out$solution, n * out$num.bin.solns), n)

It found a total of 19 solutions:

> out
Success: the objective function is 0 
         19 solutions returned

> out$num.bin.solns
[1] 19

> dim(soln)
[1] 10 19

The columns of soln are the feasible solutions. For example, the first solun is players 1, 2 and 4:

> soln[, 1]
 [1] 1 1 0 1 0 0 0 0 0 0
> which(soln[, 1]==1)
[1] 1 2 4

We can list the solutions as strings like this:

> x <- c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
> apply(soln == 1, 2, function(v) toString(x[v]))
 [1] "p1, p2, p4"     "p4, p5"         "p3, p4, p7"     "p1, p4, p7, p8"
 [5] "p1, p2, p3, p8" "p1, p2, p3, p7" "p1, p3, p9"     "p3, p4, p8"    
 [9] "p1, p2, p5"     "p3, p5, p7"     "p2, p9"         "p3, p5, p8"    
[13] "p3, p6"         "p1, p5, p7, p8" "p1, p6, p7"     "p1, p6, p8"    
[17] "p7, p8, p9"     "p8, p10"        "p7, p10"

2) wle A second approach is to create all 10^2 subsets of 1:10 as binary vectors v and then select out those for which y %*% v == 10 (where y is from the question). This approach results in concise code and would be OK as long as y is not too long.

library(wle)
m <- sapply(0:(2^10-1), function(x) binary(x, 10)$binary)
soln2 <- m[, y %*% m == 10]

Use the same method as in (1) to turn that into a vector of strings if that form is preferred.

Update: Some corrections and improvements and added (2).

Thank you so much for the answer... can i set minimum number of players in team? say minimun 3 players — areddy, Oct 25 '15 at 14:48
To constrain the number of players in the solution to k=3: k <- 3; out <- lp(objective = numeric(n), const.mat = rbind(y, 1), const.dir = c("==", "<="), const.rhs = c(n, k), all.bin = TRUE, num.bin.solns = choose(n,k)) Alternately use the soln in the answer and just select out those columns with no more than k ones, i.e. k <- 3; soln <- soln[, colSums(soln) <= k] where soln is as calcualted in the answer. — G. Grothendieck, Oct 25 '15 at 15:21

score 2 · Answer 2 · answered Oct 25 '15 at 13:22

You can use this bruteforce approach: Result would be:

 [1] "p2,p9"       "p3,p6"       "p4,p5"       "p7,p10"      "p8,p10"      "p1,p2,p4"    "p1,p2,p5"   
 [8] "p1,p3,p9"    "p1,p6,p7"    "p1,p6,p8"    "p3,p4,p7"    "p3,p4,p8"    "p3,p5,p7"    "p3,p5,p8"   
[15] "p7,p8,p9"    "p1,p2,p3,p7" "p1,p2,p3,p8" "p1,p4,p7,p8" "p1,p5,p7,p8"

Data:

x<-c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y<-c(1,4,3,5,5,7,2,2,6,8)
df<-data.frame(x=x,y=y, stringsAsFactors = FALSE)
df$id <- seq_len(nrow(df)) # Adding an ID column

Get all possible combinations of upto `max_comb` elements

max_comb <- nrow(df)
my_combn <- function(m, x){
  combn(x, m, simplify = FALSE)
}
dat <- lapply(1:max_comb, my_combn, df$id)

Set names to the picked combinations

combn_names <- function(ind, vec, collapse = ", "){
  paste(vec[ind], collapse = collapse)
}

set_list_combn_names <- function(l, vec){
  setNames(l, lapply(l, combn_names, vec = vec))
}
dat <- lapply(dat, set_list_combn_names, df$x)

Check if sum is equal to `x=10` and output names of combinations

sum_equal_x <- function(ind, vec, x){
  sum(vec[ind]) == x
}
names(which(unlist(lapply(dat, lapply, sum_equal_x, df$y, 10))))

Result:

> names(which(unlist(lapply(dat, lapply, sum_equal_x, df$y, 10))))
 [1] "p2,p9"       "p3,p6"       "p4,p5"       "p7,p10"      "p8,p10"      "p1,p2,p4"    "p1,p2,p5"   
 [8] "p1,p3,p9"    "p1,p6,p7"    "p1,p6,p8"    "p3,p4,p7"    "p3,p4,p8"    "p3,p5,p7"    "p3,p5,p8"   
[15] "p7,p8,p9"    "p1,p2,p3,p7" "p1,p2,p3,p8" "p1,p4,p7,p8" "p1,p5,p7,p8"

score 1 · Answer 3 · answered Oct 25 '15 at 13:08

You could do something like this:

require(utils)

x<-c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y<-c(1,4,3,5,5,7,2,2,6,8)
df<-data.frame(cbind(x=x,y=y))

search.val <- 10
max.num    <- length(x)

all.comb <- lapply(1:max.num, function(n){ combn(x,n) })

# Calcualte sum
#   Foreach combination length 1:n
sums <- lapply(all.comb, function(comb.mat){
  # Foreach combination of length n
  apply(comb.mat,2,function(col){
    sum(as.numeric( df[which(df$x %in% col),]$y ))
  })
})

# Find which combinations have sum 10
vals <- lapply(1:max.num,function(i){
  sum.vect <- sums[[i]]
  inds     <- which(sum.vect == search.val)
  lapply(inds, function(j){
    all.comb[[i]][,j]
  })
})

sum.of.10 <- unlist(vals,recursive=FALSE)

This is a brute force solution where all sums of all combinations upp to length max.num is calculated using the combn function.

How to choose elements that sum up to n?

3 Answers3

Data:

Get all possible combinations of upto max_comb elements

Set names to the picked combinations

Check if sum is equal to x=10 and output names of combinations

Get all possible combinations of upto `max_comb` elements

Check if sum is equal to `x=10` and output names of combinations