0

I have the following code that generate the unique permutation:

library(magrittr)
library(tictoc)

count_unique_perm <- function(l = NULL) {
  lo <- combinat::permn(l)
  do.call(rbind, lapply(lo, paste0, collapse = ""))[, 1] %>%
    unique() %>%
    length()
}

It already give the correct result. With this input:

l1 <- c("R", "R", "R", "R", "R", "R", "R", "E", "K", "P") # 720
l2 <- c("R", "R", "R", "R", "R", "Q", "G", "K", "M", "S") # 30,240

But it's running extremely slow.

tic()
count_unique_perm(l = l1)
toc()
#118.155 sec elapsed

#107.793 sec elapsed for l2

How can I speed it up?

littleworth
  • 4,781
  • 6
  • 42
  • 76
  • I couldn't replicate your speed. When I ran your code it took 23.51 seconds. Maybe try restarting your session with a clean environment and memory? – Hansel Palencia Nov 09 '22 at 12:30
  • 1
    Similar question here: [Permute all unique enumerations of a vector in R](https://stackoverflow.com/q/5671149/4408538). I provided an answer with newer packages (`arrangements` and `RcppAlgos` which I authored) targeted at this type of problem: https://stackoverflow.com/a/60243362/4408538 – Joseph Wood Nov 09 '22 at 15:53

2 Answers2

5

You don't need to generate permutations, there is closed formula. You can use package iterpc:

iterpc::multichoose(table(l1))

or in base:

factorial(length(l2)) / prod(factorial(table(l2)))
det
  • 5,013
  • 1
  • 8
  • 16
  • Thanks. How can I generate the actual permutation strings/vector with `iterpc::multichoose` ? Just like solution provided by @jblood94 below. – littleworth Nov 09 '22 at 13:42
  • You can't. It returns only number of possible different permutations. Title and provided example don't depict same problem. If you are interested in generation of those permutations see @jblood94 answer. – det Nov 09 '22 at 13:50
  • @jblood94 answer is not unique. See my response to his post. – littleworth Nov 09 '22 at 13:58
  • 1
    Please note that `iterpc` is no longer being developed. Per CRAN's package info on [iterpc](https://cran.r-project.org/package=iterpc): _"Users are recommended to switch to 'arrangements'"_. For example one could do `x <- table(l1); arrangements::npermutations(names(x), freq = x)` – Joseph Wood Nov 09 '22 at 15:48
  • @JosephWood how can I show the list of 720 permutations for `l1` using arrangements? – littleworth Nov 10 '22 at 00:36
  • 1
    @littleworth, simply remove the `n`, that is: `arrangements::permutations(names(x), freq = x)`. You can find much better examples in the documentation and in some answers here on SO. – Joseph Wood Nov 10 '22 at 10:39
5

Try the RcppAlgos package, which will return permutations of multisets by using the freqs argument.

library(RcppAlgos)
library(microbenchmark)

# get a matrix of unique permutations
x <- table(c("R", "R", "R", "R", "R", "R", "R", "E", "K", "P"))
y <- table(c("R", "R", "R", "R", "R", "Q", "G", "K", "M", "S"))

microbenchmark(permx = permuteGeneral(names(x), freqs = x),
               permy = permuteGeneral(names(y), freqs = y))
#> Unit: microseconds
#>   expr    min     lq     mean  median      uq    max neval
#>  permx   32.3   38.0   44.018   42.05   47.95   64.8   100
#>  permy 1538.8 1567.7 1751.259 1606.60 1649.35 5082.5   100
dim(permuteGeneral(names(x), freqs = x))
#> [1] 720  10
dim(permuteGeneral(names(y), freqs = y))
#> [1] 30240    10

To get just the number of unique permutations, use permuteCount.

microbenchmark(permx = permuteCount(names(x), freqs = x),
               permy = permuteCount(names(y), freqs = y))
#> Unit: microseconds
#>   expr min  lq  mean median  uq  max neval
#>  permx 1.5 1.6 1.791    1.6 1.8  6.6   100
#>  permy 1.5 1.6 2.260    1.7 1.8 46.2   100
permuteCount(names(x), freqs = x)
#> [1] 720
permuteCount(names(y), freqs = y)
#> [1] 30240
jblood94
  • 10,340
  • 1
  • 10
  • 15