
I have a vector of elements named markers of the form:

markers <- LETTERS[1:5]

Each element in markers is Boolean, with two possible states, + and -. I would like a fast and efficient way to obtain all possible combinations in which both states are considered (a marker cannot be paired with itself, even if the state is different).

The result would ideally be just a character vector or a list, where its elements are the marker combinations separated by /.

The elements for this example with five letters should be:

A-/B-/C-/D-/E-
A-/B+/C-/D-/E-
A-/B-/C+/D-/E-
A-/B-/C-/D+/E-
A-/B-/C-/D-/E+
A-/B+/C+/D-/E-
A-/B+/C-/D+/E-
A-/B+/C-/D-/E+
A-/B+/C+/D+/E-
A-/B+/C+/D-/E+
A-/B+/C+/D+/E+
A+/B-/C-/D-/E-
A+/B+/C-/D-/E-
A+/B-/C+/D-/E-
A+/B-/C-/D+/E-
A+/B-/C-/D-/E+
A+/B+/C+/D-/E-
A+/B+/C-/D+/E-
A+/B+/C-/D-/E+
A+/B+/C+/D+/E-
A+/B+/C+/D-/E+
A+/B+/C+/D+/E+
...

Not sure if I'm missing any combination, but you get the idea... I've been trying with expand.grid and combn but I don't seem to get it right. Any help appreciated!

Thanks!

DaniCee
  • It's just that I start off with a vector like `c('A','B','C'...)`, but each element can take one of two forms, `-` or `+`, as in 'A+', 'A-', 'B+', 'B-', etc. I would like that reflected in the resulting combination vector (but `A-` cannot combine with `A+`, for example) – DaniCee Sep 24 '20 at 06:31

2 Answers

markers <- LETTERS[1:5]

# One column of "+"/"-" per marker; each row is one sign pattern (2^5 = 32 rows)
test <- expand.grid(lapply(seq_along(markers), function(x) c("+", "-")), stringsAsFactors = FALSE)

> test
   Var1 Var2 Var3 Var4 Var5
1     +    +    +    +    +
2     -    +    +    +    +
3     +    -    +    +    +
4     -    -    +    +    +
 ....


# Paste each marker to its sign, then join the pairs of each row with "/"
apply(test, 1, function(x) paste0(markers, x, collapse = "/"))


 [1] "A+/B+/C+/D+/E+" "A-/B+/C+/D+/E+" "A+/B-/C+/D+/E+" "A-/B-/C+/D+/E+" "A+/B+/C-/D+/E+" "A-/B+/C-/D+/E+" "A+/B-/C-/D+/E+"
 [8] "A-/B-/C-/D+/E+" "A+/B+/C+/D-/E+" "A-/B+/C+/D-/E+" "A+/B-/C+/D-/E+" "A-/B-/C+/D-/E+" "A+/B+/C-/D-/E+" "A-/B+/C-/D-/E+"
[15] "A+/B-/C-/D-/E+" "A-/B-/C-/D-/E+" "A+/B+/C+/D+/E-" "A-/B+/C+/D+/E-" "A+/B-/C+/D+/E-" "A-/B-/C+/D+/E-" "A+/B+/C-/D+/E-"
[22] "A-/B+/C-/D+/E-" "A+/B-/C-/D+/E-" "A-/B-/C-/D+/E-" "A+/B+/C+/D-/E-" "A-/B+/C+/D-/E-" "A+/B-/C+/D-/E-" "A-/B-/C+/D-/E-"
[29] "A+/B+/C-/D-/E-" "A-/B+/C-/D-/E-" "A+/B-/C-/D-/E-" "A-/B-/C-/D-/E-"
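
The same idea can be wrapped in a small function that skips the row-wise apply() call. This is only a sketch and not part of the original answer: the function name marker_combinations and its states argument are hypothetical, and it assumes that pasting whole columns at once is an acceptable substitute for apply().

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer).
# Returns every "+"/"-" assignment over `markers` as "A+/B-/..." strings.
marker_combinations <- function(markers, states = c("+", "-")) {
  # One column of states per marker; rows enumerate all length(states)^n patterns.
  grid <- expand.grid(rep(list(states), length(markers)),
                      stringsAsFactors = FALSE)
  # Prefix each column with its marker name: "A" with column 1, "B" with column 2, ...
  labelled <- Map(paste0, markers, grid)
  # Join the columns element-wise with "/"; unname() keeps marker names
  # from being mistaken for paste() arguments such as `sep`.
  do.call(paste, c(unname(labelled), sep = "/"))
}

res <- marker_combinations(LETTERS[1:5])
length(res)   # 2^5 patterns
head(res, 3)
```

Because every marker appears exactly once per string, a marker can never be paired with itself, and the number of results is always 2^n for n markers.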
denis
  • You may want to use `stringsAsFactors=FALSE` in expand.grid (on R 4.0.2 it is still set to TRUE). Then you can drop the mutate_all – zeehio Sep 24 '20 at 06:53
  • nice elegant solution! yeah, I was going to mention the stringsAsFactors argument too, mutate line not needed – DaniCee Sep 24 '20 at 07:07
  • Thanks! I did not know about the stringsAsFactors argument; it helps a lot. Edited – denis Sep 24 '20 at 07:19

To add to the excellent base R answer by @denis, here is a one-liner using RcppAlgos*. It should be a bit more efficient than the proposed solution:

n <- 5

RcppAlgos::permuteGeneral(c("+", "-"), n, repetition = TRUE, FUN = function(x) {
    paste0(LETTERS[1:n], x, collapse = "/")
})

[[1]]
[1] "A+/B+/C+/D+/E+"

[[2]]
[1] "A+/B+/C+/D+/E-"

[[3]]
[1] "A+/B+/C+/D-/E+"

.
.
.

[[30]]
[1] "A-/B-/C-/D+/E-"

[[31]]
[1] "A-/B-/C-/D-/E+"

[[32]]
[1] "A-/B-/C-/D-/E-"

Note that most of the computation time goes into manipulating character vectors, so it will be difficult to achieve any large efficiency gain no matter what tool you use.
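
One rough way to check this on your own machine is to time the enumeration step and the string-building step separately. The sketch below uses only base R and an arbitrary n of 15; the variable names are illustrative, and the timings will differ between machines, so no expected numbers are shown.

```r
n <- 15                          # arbitrary size chosen for illustration
markers <- LETTERS[seq_len(n)]

# Step 1: enumerate the 2^n sign patterns only (no string building yet).
t_grid <- system.time(
  grid <- expand.grid(rep(list(c("+", "-")), n), stringsAsFactors = FALSE)
)

# Step 2: build the "A+/B-/..." labels from those patterns.
t_paste <- system.time(
  labels <- do.call(paste, c(unname(Map(paste0, markers, grid)), sep = "/"))
)

# Elapsed seconds for each step, side by side.
c(grid = t_grid[["elapsed"]], paste = t_paste[["elapsed"]])
```

If the paste step dominates, swapping the enumeration tool alone is unlikely to change the total time much.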

* I am the author

Joseph Wood