Given an ordered vector of strings where each string shows the possible characters in that position, how can I get all possible combinations of strings?

For example, given the vector:

vec <- c("A", "A", "T", "C", "AG", "ACG", "T", "A", "A")

Given that position 5 can be either "A" or "G", and position 6 can be "A", "C", or "G", the possible strings are:

strings <- c("AATCAATAA",
             "AATCACTAA",
             "AATCAGTAA",
             "AATCGATAA",
             "AATCGCTAA",
             "AATCGGTAA")
Powege

1 Answer

Split your vector into individual characters, then use expand.grid():

vec <- c("A", "A", "T", "C", "AG", "ACG", "T", "A", "A")

strings <- expand.grid(strsplit(vec, ""), stringsAsFactors = FALSE)
strings
#>   Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9
#> 1    A    A    T    C    A    A    T    A    A
#> 2    A    A    T    C    G    A    T    A    A
#> 3    A    A    T    C    A    C    T    A    A
#> 4    A    A    T    C    G    C    T    A    A
#> 5    A    A    T    C    A    G    T    A    A
#> 6    A    A    T    C    G    G    T    A    A

This gives us a data frame with one row per combination, but we can paste each row together to get a single character vector:

apply(strings, 1, paste0, collapse = "")
#> [1] "AATCAATAA" "AATCGATAA" "AATCACTAA" "AATCGCTAA" "AATCAGTAA" "AATCGGTAA"
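
As a variation on the same idea, here is a minimal sketch that wraps both steps in a function. The name expand_positions is made up for illustration and is not part of base R. It assumes every element of vec is a non-empty string of single-character options, and it swaps the row-wise apply() for do.call(paste0, ...), which pastes the columns together in one vectorised call. Because expand.grid() varies the first position fastest, the result is sorted here to match the order shown in the question.

```r
# Hypothetical helper (not from the original answer): expand a vector of
# per-position character sets into every possible string.
expand_positions <- function(vec) {
  # One list element per position, holding that position's possible characters
  choices <- strsplit(vec, "")
  # One row per combination; the first position varies fastest
  grid <- expand.grid(choices, stringsAsFactors = FALSE)
  # A data frame is a list of columns, so paste0() joins them element-wise
  do.call(paste0, grid)
}

vec <- c("A", "A", "T", "C", "AG", "ACG", "T", "A", "A")
sort(expand_positions(vec))
#> [1] "AATCAATAA" "AATCACTAA" "AATCAGTAA" "AATCGATAA" "AATCGCTAA" "AATCGGTAA"
```

The number of results is the product of the option counts per position (here 2 * 3 = 6), so long vectors with many ambiguous positions can grow quickly.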
Joe Roe