So far, generating permutations with iterpc is the fastest approach
I have found. Example usage:
library(iterpc)
set.seed(143)
dat <- sample(LETTERS[1:4], 10, replace = TRUE)
np_multiset(table(dat), length(dat))
# [1] 18900
I <- iterpc(table(dat), order=TRUE)
out <- getall(I)
getnext(I)
# [1] A A A A B B C C D D
# Levels: A B C D
getcurrent(I)
# [1] A A A A B B C C D D
# Levels: A B C D
The resulting matrix would be 18900 by 10, which is too large to store in a single matrix. With the help of getnext(I, 1000), I can get the permutations in chunks of 1000 and work on each chunk. However, all these permutations come out ordered by label. Is there any way to sample 1000 permutations from the set of 18900 in random order rather than in sequence?
Expected output (but without generating all permutations in out):
Isam <- sample(18900, 10)
# [1] 15746 18026 17881 18687 7513 1975 5575 2845 1275 10207
out[Isam,]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] "B" "A" "A" "A" "D" "C" "C" "B" "A" "D"
# [2,] "B" "D" "A" "A" "C" "D" "A" "C" "B" "A"
# [3,] "B" "A" "A" "B" "C" "A" "A" "D" "C" "D"
# [4,] "A" "C" "A" "C" "D" "B" "A" "B" "A" "D"
# [5,] "C" "D" "A" "A" "A" "C" "B" "B" "D" "A"
# [6,] "A" "B" "A" "D" "A" "D" "A" "B" "C" "C"
# [7,] "B" "A" "A" "D" "B" "C" "C" "A" "A" "D"
# [8,] "A" "A" "D" "C" "B" "D" "A" "A" "C" "B"
# [9,] "D" "C" "A" "C" "D" "B" "A" "B" "A" "A"
# [10,] "C" "D" "D" "A" "A" "A" "C" "B" "B" "A"
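For reference, here is a minimal sketch of the kind of result I am after, using base R only. It assumes that a uniform random draw over the distinct arrangements is what is wanted: shuffling the vector with sample() makes every distinct arrangement equally likely, because each one corresponds to the same number of orderings of the underlying elements, and duplicates are then rejected. The helper name sample_multiset_perms is hypothetical and not part of iterpc, and dat is rebuilt from the counts shown above (4 A, 2 B, 2 C, 2 D) rather than from the seed.

```r
# Hypothetical helper (not part of iterpc): draw n distinct random
# permutations of the multiset x without enumerating all of them.
sample_multiset_perms <- function(x, n) {
  x <- as.character(x)
  # Number of distinct arrangements: length(x)! / prod(counts!)
  counts <- as.vector(table(x))
  total <- round(exp(lfactorial(length(x)) - sum(lfactorial(counts))))
  stopifnot(n <= total)

  seen <- new.env(hash = TRUE)  # keys of arrangements already drawn
  out <- matrix(NA_character_, nrow = n, ncol = length(x))
  i <- 0
  while (i < n) {
    p <- sample(x)  # uniform random shuffle of the multiset
    # Assumption: no element of x contains the separator "|"
    key <- paste(p, collapse = "|")
    if (!exists(key, envir = seen, inherits = FALSE)) {
      assign(key, TRUE, envir = seen)
      i <- i + 1
      out[i, ] <- p
    }
  }
  out
}

# Same multiset as in the example above: 4 A, 2 B, 2 C, 2 D
dat <- rep(LETTERS[1:4], times = c(4, 2, 2, 2))
sam <- sample_multiset_perms(dat, 10)
```

Rejection is cheap while n is small relative to the 18900 possible arrangements, but it slows down as n approaches the total, so this is only a sketch of the desired behaviour and not a replacement for an indexed lookup.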