This is simply a matter of generating the permutations of the multiset 0:1. A couple of libraries are capable of handling these efficiently: RcppAlgos (I am the author) and arrangements.
RcppAlgos::permuteGeneral(1:0, freqs = c(3, 3))
arrangements::permutations(x = 1:0, freq = c(3, 3))
Both give the desired result. Note that the vector is passed in descending order (i.e. 1:0). This is because both libraries produce their output in lexicographical order relative to the vector passed, so passing 1:0 makes the rows that start with 1 come first.
As noted in the comments, none of the posted solutions will work for your real data, as the number of results is far too large:
RcppAlgos::permuteCount(0:1, freqs = c(100,100))
[1] 9.054851e+58
arrangements::npermutations(x = 0:1, freq = c(100, 100), bigz = TRUE)
Big Integer ('bigz') :
[1] 90548514656103281165404177077484163874504589675413336841320
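As a cross-check on the counts above, the number of permutations of a multiset is the multinomial coefficient n! / (f1! * f2! * ...). Below is a minimal base-R sketch; the helper name multiset_perm_count is hypothetical and not part of either package. It works on the log scale, so the result is a floating-point approximation rather than an exact big integer.

```r
# Hypothetical helper (not from RcppAlgos or arrangements):
# number of distinct permutations of a multiset whose elements
# occur with the multiplicities given in `freqs`.
multiset_perm_count <- function(freqs) {
  # n! / prod(f_i!) evaluated via log-gamma to avoid overflow
  exp(lgamma(sum(freqs) + 1) - sum(lgamma(freqs + 1)))
}

small <- multiset_perm_count(c(3, 3))      # 6! / (3! * 3!), i.e. 20 up to rounding
large <- multiset_perm_count(c(100, 100))  # same quantity as choose(200, 100)

print(round(small))
print(large)
```

With only two distinct values, the count reduces to the binomial coefficient, which is why choose(200, 100) matches the figure reported for the real data.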
Since generating this amount of data at one time is simply not feasible, both packages, arrangements and RcppAlgos, offer alternative approaches that allow one to tackle larger problems.
arrangements
For the package arrangements, you can set up an iterator that generates combinations/permutations n at a time, avoiding the overhead of generating all of them.
library(arrangements)
iperm <- ipermutations(x = 1:0, freq = c(3,3))
## get the first 5 permutations
iperm$getnext(d = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    1    1    0    1    0    0
[3,]    1    1    0    0    1    0
[4,]    1    1    0    0    0    1
[5,]    1    0    1    1    0    0
## get the next 5 permutations
iperm$getnext(d = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    0    1    0
[2,]    1    0    1    0    0    1
[3,]    1    0    0    1    1    0
[4,]    1    0    0    1    0    1
[5,]    1    0    0    0    1    1
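The iterator can also be drained in a loop, so that only one chunk is held in memory at a time. The following is a sketch under one assumption: that getnext() returns NULL once the iterator is exhausted, which is the pattern I recall from the arrangements documentation. The row counting stands in for whatever analysis you actually need, and the call is guarded so the snippet still runs when the package is not installed.

```r
n_seen <- 0

if (requireNamespace("arrangements", quietly = TRUE)) {
  iperm <- arrangements::ipermutations(x = 1:0, freq = c(3, 3))

  # Assumption: getnext() yields NULL when no permutations remain.
  while (!is.null(chunk <- iperm$getnext(d = 5))) {
    ## Replace this line with the real per-chunk analysis
    n_seen <- n_seen + nrow(chunk)
  }
}

print(n_seen)
```

The chunk size of 5 divides the total evenly here; for other sizes it is worth checking how the final, shorter chunk is returned before relying on nrow().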
RcppAlgos
For RcppAlgos, there are arguments lower and upper that allow for the generation of specific chunks.
library(RcppAlgos)
permuteGeneral(1:0, freqs = c(3,3), lower = 1, upper = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    1    1    0    1    0    0
[3,]    1    1    0    0    1    0
[4,]    1    1    0    0    0    1
[5,]    1    0    1    1    0    0
permuteGeneral(1:0, freqs = c(3,3), lower = 6, upper = 10)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    0    1    0
[2,]    1    0    1    0    0    1
[3,]    1    0    0    1    1    0
[4,]    1    0    0    1    0    1
[5,]    1    0    0    0    1    1
As these chunks are generated independently, one can easily generate and analyze in parallel:
library(parallel)
mclapply(seq(1, 20, 5), function(x) {
    a <- permuteGeneral(1:0, freqs = c(3, 3), lower = x, upper = x + 4)
    ## Do some analysis
}, mc.cores = detectCores() - 1)
You will not notice any speedup for this small example, but there is a noticeable gain as the number of results gets large.
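When the chunk size does not divide the total evenly, the last upper bound has to be clamped to the total. A minimal sketch of computing the bounds follows; chunk_bounds is a hypothetical helper, not something RcppAlgos provides, and the permuteGeneral call is guarded so the snippet runs even if the package is missing.

```r
# Hypothetical helper: lower/upper index pairs covering 1..total
# in chunks of at most `size` results each.
chunk_bounds <- function(total, size) {
  lower <- seq(1, total, by = size)
  upper <- pmin(lower + size - 1, total)  # clamp the final chunk
  data.frame(lower = lower, upper = upper)
}

bounds <- chunk_bounds(total = 20, size = 6)
print(bounds)

if (requireNamespace("RcppAlgos", quietly = TRUE)) {
  # One independent chunk per row of `bounds`
  chunks <- Map(function(lo, hi) {
    RcppAlgos::permuteGeneral(1:0, freqs = c(3, 3), lower = lo, upper = hi)
  }, bounds$lower, bounds$upper)
  print(sum(vapply(chunks, nrow, integer(1))))
}
```

The same pairs can be handed to mclapply in place of Map. For counts as large as in your real data, you would build bounds only for the range you intend to examine rather than for the full total.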
There is a lot more information on this topic in a summary I wrote to the question: R: Permutations and combinations with/without replacement and for distinct/non-distinct items/multiset.