The package RcppAlgos
(I am the author) was built specifically for this task. Below is a solution that utilizes comboGeneral
which is used to generate combinations with or without repetition as well as of multisets. We can also constrain the output with the use of constraintFun
, comparisonFun
, and limitConstraints
:
library(RcppAlgos)
mySubsetSum <- function(x, myTarget, repeats = FALSE) {
if (0 %in% x)
x <- x[-which(x == 0)]
if (repeats) {
tbl <- table(x)
vals <- as.numeric(names(tbl))
myfreqs <- unname(tbl)
} else {
vals <- unique(x)
myfreqs <- rep(1, length(vals))
}
## Add zero length(vals) - 1 times
myfreqs <- c(length(vals) - 1, myfreqs)
vals <- c(0L, vals)
combs <- comboGeneral(vals, length(vals), freqs = myfreqs,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = myTarget)
lapply(1:nrow(combs), function(x) combs[x, combs[x,] != 0])
}
mySubsetSum(x, 60)
[[1]]
[1] 26 34
mySubsetSum(x, 60, TRUE)
[[1]]
[1] 26 34
[[2]]
[1] 13 13 34
It is written in C++
for maximum speed. Here is an example demonstrating it's efficiency:
set.seed(42)
s <- sample(-50:50, 20)
s
[1] 42 43 -22 31 12 -1 19 -38 11 14 -9 41 33 -28 -10 30 38 -41 -11 -5
system.time(bigTest <- mySubsetSum(s, 170))
user system elapsed
0.039 0.000 0.040
length(bigTest)
[1] 2135
## Over 1 million total combinations
comboCount(c(0, s), 20, freqs = c(19, rep(1, 20)))
[1] 1048575
Here is a sample of the output from bigTest
bigTest[1:5]
[[1]]
[1] -41 -38 -28 -22 -10 -5 11 12 14 19 30 31 33 38 41 42 43
[[2]]
[1] -41 -38 -28 -22 -9 -5 -1 11 12 14 19 30 31 33 38 41 42 43
[[3]]
[1] -41 -38 -28 -22 -1 11 12 19 30 31 33 38 41 42 43
[[4]]
[1] -41 -38 -28 -11 -10 -5 12 14 19 30 31 33 38 41 42 43
[[5]]
[1] -41 -38 -28 -11 -9 -5 -1 12 14 19 30 31 33 38 41 42 43
## View the last 5 combinations
bigTest[length(bigTest) - 4:0]
[[1]]
[1] 12 14 31 33 38 42
[[2]]
[1] 14 19 30 31 33 43
[[3]]
[1] 11 12 14 19 30 41 43
[[4]]
[1] 11 12 14 19 31 41 42
[[5]]
[1] 11 12 14 19 33 38 43
all(sapply(bigTest, sum) == 170)
[1] TRUE
You can also find combinations that have a sum between a range. It looks like the OP has this mind originally as the OP makes use of between
from dplyr
(at least I think it is from dplyr
).
sumRange <- comboGeneral(c(0, s), 20, freqs = c(19, rep(1, 20)),
constraintFun = "sum",
comparisonFun = c(">", "<"),
limitConstraints = c(155, 165))
all(rowSums(sumRange) > 155 & rowSums(sumRange) < 165)
[1] TRUE