1

I have the following code in R (published by Mark - Finding all possible combinations of numbers to reach a given sum) that allows me to add numbers upto the target value. However, I want the code to return all possible combinations without repeating the same numbers and also store the removed numbers in a separate list. Do I need to assign index values?

x <- c(55,10,13,26,34,72,51,96,13)

subset_sum = function(numbers,target,partial=0){
  if(any(is.na(partial))) return()
  s = sum(partial)
  if(s > target) return()
  if(between(s,target-3,target)) print(sprintf("%s = %s",paste(partial[-1],collapse=" "),s))
  for(i in seq_along(numbers)){
      n = numbers[i]
      remaining = numbers[(i+1):length(numbers)]
      subset_sum(remaining,target,c(partial,n))
    }
}

subset_sum(x,60)

#The actual output is:
[1] "10 13 34 = 57"
[1] "26 34 = 60"

#I expect the output as (no repetition of 34):
[1] 26 34 = 60

Thanks in advance!

Community
  • 1
  • 1
AM_123
  • 205
  • 1
  • 2
  • 10

1 Answers1

0

The package RcppAlgos (I am the author) was built specifically for this task. Below is a solution that utilizes comboGeneral which is used to generate combinations with or without repetition as well as of multisets. We can also constrain the output with the use of constraintFun, comparisonFun, and limitConstraints:

library(RcppAlgos)
mySubsetSum <- function(x, myTarget, repeats = FALSE) {
    if (0 %in% x)
        x <- x[-which(x == 0)]

    if (repeats) {
        tbl <- table(x)
        vals <- as.numeric(names(tbl))
        myfreqs <- unname(tbl)
    } else {
        vals <- unique(x)
        myfreqs <- rep(1, length(vals))
    }

    ## Add zero length(vals) - 1 times
    myfreqs <- c(length(vals) - 1, myfreqs)
    vals <- c(0L, vals)

    combs <- comboGeneral(vals, length(vals), freqs = myfreqs, 
                          constraintFun = "sum", 
                          comparisonFun = "==", 
                          limitConstraints = myTarget)

    lapply(1:nrow(combs), function(x) combs[x, combs[x,] != 0])
}

mySubsetSum(x, 60)
[[1]]
[1] 26 34

mySubsetSum(x, 60, TRUE)
[[1]]
[1] 26 34

[[2]]
[1] 13 13 34

It is written in C++ for maximum speed. Here is an example demonstrating it's efficiency:

set.seed(42)
s <- sample(-50:50, 20)
s
[1]  42  43 -22  31  12  -1  19 -38  11  14  -9  41  33 -28 -10  30  38 -41 -11  -5

system.time(bigTest <- mySubsetSum(s, 170))
 user  system elapsed 
0.039   0.000   0.040

length(bigTest)
[1] 2135

## Over 1 million total combinations
comboCount(c(0, s), 20, freqs = c(19, rep(1, 20)))
[1] 1048575

Here is a sample of the output from bigTest

bigTest[1:5]
[[1]]
[1] -41 -38 -28 -22 -10  -5  11  12  14  19  30  31  33  38  41  42  43

[[2]]
[1] -41 -38 -28 -22  -9  -5  -1  11  12  14  19  30  31  33  38  41  42  43

[[3]]
[1] -41 -38 -28 -22  -1  11  12  19  30  31  33  38  41  42  43

[[4]]
[1] -41 -38 -28 -11 -10  -5  12  14  19  30  31  33  38  41  42  43

[[5]]
[1] -41 -38 -28 -11  -9  -5  -1  12  14  19  30  31  33  38  41  42  43

## View the last 5 combinations
bigTest[length(bigTest) - 4:0]
[[1]]
[1] 12 14 31 33 38 42

[[2]]
[1] 14 19 30 31 33 43

[[3]]
[1] 11 12 14 19 30 41 43

[[4]]
[1] 11 12 14 19 31 41 42

[[5]]
[1] 11 12 14 19 33 38 43


all(sapply(bigTest, sum) == 170)
[1] TRUE

You can also find combinations that have a sum between a range. It looks like the OP has this mind originally as the OP makes use of between from dplyr (at least I think it is from dplyr).

sumRange <- comboGeneral(c(0, s), 20, freqs = c(19, rep(1, 20)), 
                         constraintFun = "sum", 
                         comparisonFun = c(">", "<"), 
                         limitConstraints = c(155, 165))

all(rowSums(sumRange) > 155 & rowSums(sumRange) < 165)
[1] TRUE
Joseph Wood
  • 7,077
  • 2
  • 30
  • 65