So far, generating permutations with iterpc is the fastest approach I have found. Example usage:

library(iterpc)
set.seed(143)

dat <- sample(LETTERS[1:4], 10, replace = TRUE)
np_multiset(table(dat), length(dat))
# [1] 18900

I <- iterpc(table(dat), order=TRUE)
out <- getall(I)

getnext(I)
#   [1] A A A A B B C C D D
# Levels: A B C D

getcurrent(I)
#   [1] A A A A B B C C D D
# Levels: A B C D

The resulting matrix would be 18900 by 10, which is too large to store in a single matrix. With getnext(I, 1000), I can get the permutations in chunks of 1000 and work on each chunk. However, these permutations come out in sequence, ordered by label. Is there any way to sample 1000 permutations from the set of 18900 in random order rather than in sequence?

Expected output (but without generating all the permutations in out first):

Isam <- sample(18900, 10)
# [1] 15746 18026 17881 18687  7513  1975  5575  2845  1275 10207

out[Isam,]
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,] "B"  "A"  "A"  "A"  "D"  "C"  "C"  "B"  "A"  "D"  
#  [2,] "B"  "D"  "A"  "A"  "C"  "D"  "A"  "C"  "B"  "A"  
#  [3,] "B"  "A"  "A"  "B"  "C"  "A"  "A"  "D"  "C"  "D"  
#  [4,] "A"  "C"  "A"  "C"  "D"  "B"  "A"  "B"  "A"  "D"  
#  [5,] "C"  "D"  "A"  "A"  "A"  "C"  "B"  "B"  "D"  "A"  
#  [6,] "A"  "B"  "A"  "D"  "A"  "D"  "A"  "B"  "C"  "C"  
#  [7,] "B"  "A"  "A"  "D"  "B"  "C"  "C"  "A"  "A"  "D"  
#  [8,] "A"  "A"  "D"  "C"  "B"  "D"  "A"  "A"  "C"  "B"  
#  [9,] "D"  "C"  "A"  "C"  "D"  "B"  "A"  "B"  "A"  "A"  
# [10,] "C"  "D"  "D"  "A"  "A"  "A"  "C"  "B"  "B"  "A" 
Prradep
  • I don't understand why you can't just use sample to create a permutation when you need one? – Roland Oct 29 '17 at 18:23
  • I tried that approach (e.g. `replicate(9000, sample(dat, length(dat), FALSE))`). In each of ten such runs, I got fewer than 6.5k unique permutations, so I couldn't get enough unique permutations. – Prradep Oct 29 '17 at 20:43
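
Following up on the two comments above: one possible way to make the plain sample() idea return only distinct permutations is to keep drawing batches of shuffles and drop duplicate rows until enough unique ones have been collected. The following is a minimal base-R sketch under that assumption, not an iterpc feature. The function name `sample_unique_perms` and its arguments are hypothetical, and the approach is only a rough illustration of rejection-style sampling.

```r
# Hypothetical helper (not part of iterpc): draw n distinct random
# permutations of the multiset x without enumerating all permutations.
sample_unique_perms <- function(x, n) {
  x <- as.character(x)
  k <- length(x)
  # Number of distinct permutations of a multiset: k! / prod(count_i!)
  counts <- as.vector(table(x))
  total <- round(exp(lgamma(k + 1) - sum(lgamma(counts + 1))))
  if (n > total) stop("n exceeds the number of distinct permutations")

  perms <- matrix(character(0), nrow = 0, ncol = k)
  while (nrow(perms) < n) {
    need <- n - nrow(perms)
    # Each call to sample(x) is one shuffle; store one shuffle per row
    batch <- matrix(replicate(2L * need, sample(x)), ncol = k, byrow = TRUE)
    # unique() on a matrix drops duplicated rows and keeps first occurrences
    perms <- unique(rbind(perms, batch))
  }
  perms[seq_len(n), , drop = FALSE]
}

set.seed(143)
x <- c("A", "A", "A", "A", "B", "B", "C", "C", "D", "D")
res <- sample_unique_perms(x, 1000)
dim(res)
```

Only the rows collected so far are held in memory, never the full set of 18900. The loop gets slower as n approaches the total number of distinct permutations, because most new draws are then duplicates, so this sketch is probably best suited to cases where n is well below that total.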

0 Answers