
I'd like to randomly assign subjects into two equally sized groups and see all possible outcomes using R.

For instance, suppose there are 10 subjects, and I would like to allocate them to the Treatment and Control groups. Then there are 10!/(5!5!) = 252 ways of assigning the subjects to the two groups. Instead of seeing one random result, I want to see all possible results. Ideally, I would like the results to look something like this:

[1] T T T T T C C C C C
[2] T T T T C T C C C C
     (omitted)
[252] C C C C C T T T T T  

C: control group, T: treatment group.

Are there any R functions that can achieve this goal? Thank you

Rohit
hslee
  • Check out `combn`, e.g. `cbind(t(combn(1:10, 5)), t(apply(t(combn(1:10, 5)), 1L, function(x) setdiff(1:10, c(x)))))`; the first 5 columns are then the subjects in T and the last 5 columns are the subjects in C – chinsoon12 Mar 05 '20 at 09:08
  • Possible duplicate: https://stackoverflow.com/questions/11095992/generating-all-distinct-permutations-of-a-list-in-r – Rohit Mar 05 '20 at 10:06
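
The count stated in the question can be checked directly in base R. This is only a small sanity check, assuming 10 subjects split 5/5; the variable names `n`, `k`, and `idx` are illustrative and not from the original post.

```r
# Number of ways to choose which 5 of 10 subjects go to one group.
n <- 10
k <- n / 2
choose(n, k)                                      # 252
factorial(n) / (factorial(k) * factorial(n - k))  # 252

# combn() enumerates those choices, one combination per column.
idx <- combn(n, k)
dim(idx)  # 5 rows (chosen subjects) by 252 columns (assignments)
```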

2 Answers


Suggested solution using base R: first, create a matrix of indices for the control group "C" with combn(N_observation, floor(N_observation / 2)); each column of this matrix is one choice of control subjects. Then, using apply, pass each column of this "index matrix" to a function that creates a vector of "T"s and uses the indices to change the chosen positions to "C". Finally, a second apply collapses each column into a string:

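f <- function(N_observation) {
  apply(
    # Inner apply: one column of "T"/"C" labels per combination of control indices
    apply(combn(N_observation, floor(N_observation / 2)), 2, function(x) {
      vec <- rep("T", N_observation)  # start with everyone in treatment
      vec[x] <- "C"                   # move the chosen indices to control
      return(vec)
    }),
    # Outer apply: collapse each column into a single string
    2, paste0, collapse = "")
}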

f(4)

Returns:

[1] "CCTT" "CTCT" "CTTC" "TCCT" "TCTC" "TTCC"
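
If one row per assignment is preferred, as in the layout sketched in the question, the same combn idea can be arranged as a character matrix. The following is a minimal sketch, not part of the answer above: the helper name `assignments` is hypothetical, and it assumes an even number of subjects with combn choosing the treated positions.

```r
# Hypothetical helper: one row per assignment, one column per subject.
assignments <- function(n) {
  stopifnot(n %% 2 == 0)        # equal group sizes need an even n
  treated <- combn(n, n / 2)    # each column: indices that receive "T"
  n_ways <- ncol(treated)
  out <- matrix("C", nrow = n_ways, ncol = n)
  # Matrix indexing: pair each assignment (row) with its treated subjects (column).
  out[cbind(rep(seq_len(n_ways), each = n / 2), as.vector(treated))] <- "T"
  out
}

m <- assignments(10)
nrow(m)                          # 252
paste(m[1, ], collapse = " ")    # "T T T T T C C C C C"
paste(m[2, ], collapse = " ")    # "T T T T C T C C C C"
paste(m[252, ], collapse = " ")  # "C C C C C T T T T T"
```

Because combn returns combinations in lexicographic order, the rows appear in the order shown in the question.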
dario

I am not sure if this is what you are looking for, but here is an approach using gtools::permutations. With repeats.allowed = TRUE, it generates every sequence of 'C' and 'T' of length n (2^10 = 1024 rows for n = 10), which is a superset of the balanced assignments, so we keep only the rows where the counts of 'C' and 'T' are equal.

Let me know if I have misunderstood the question or if the solution doesn't work for you.

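library(gtools)
grps <- c('C', 'T')
n <- 10
# All 2^n sequences of 'C'/'T' of length n (repeats allowed)
p <- permutations(length(grps), n, grps, repeats.allowed = TRUE)
# Keep only the rows where exactly n/2 entries are 'C'
data.frame(p[rowSums(p == 'C') == n / length(grps), ], stringsAsFactors = FALSE)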

Output (first few rows):

    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1    C  C  C  C  C  T  T  T  T   T
2    C  C  C  C  T  C  T  T  T   T
3    C  C  C  C  T  T  C  T  T   T
4    C  C  C  C  T  T  T  C  T   T
5    C  C  C  C  T  T  T  T  C   T
6    C  C  C  C  T  T  T  T  T   C
7    C  C  C  T  C  C  T  T  T   T
8    C  C  C  T  C  T  C  T  T   T
9    C  C  C  T  C  T  T  C  T   T

If you want to paste each row into a single string and collect the results in a separate vector, you can use do.call:

library(gtools)
grps <- c('C', 'T')
n <- 10
p <- permutations(length(grps), n, grps, repeats.allowed = TRUE)
dfs <- data.frame(p[rowSums(p == 'C') == n / length(grps), ], stringsAsFactors = FALSE)
# Paste the columns together element-wise: one string per row
do.call('paste0', dfs)

Output (first few elements):

  [1] "CCCCCTTTTT" "CCCCTCTTTT" "CCCCTTCTTT" "CCCCTTTCTT"
  [5] "CCCCTTTTCT" "CCCCTTTTTC" "CCCTCCTTTT" "CCCTCTCTTT"
  [9] "CCCTCTTCTT" "CCCTCTTTCT" "CCCTCTTTTC" "CCCTTCCTTT"
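
For readers without gtools installed, the same generate-then-filter idea can be sketched in base R. This is an assumption-laden variant, not the code from this answer: it swaps in expand.grid for gtools::permutations, and the names `all_seq`, `balanced`, and `res` are illustrative. Note that expand.grid varies the first column fastest, so the row order differs from the output above; the strings are sorted here only to make the result easier to compare.

```r
n <- 10
grps <- c("C", "T")

# All 2^n label sequences, one per row.
all_seq <- as.matrix(expand.grid(rep(list(grps), n), stringsAsFactors = FALSE))

# Keep only the balanced rows (exactly n/2 controls).
balanced <- all_seq[rowSums(all_seq == "C") == n / 2, , drop = FALSE]

nrow(all_seq)   # 1024
nrow(balanced)  # 252

# Collapse each row into one string and sort.
res <- sort(apply(balanced, 1, paste0, collapse = ""))
head(res, 3)
```

Both this sketch and the gtools version build all 2^n sequences before filtering, so memory grows quickly with n; enumerating with combn avoids that.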
PKumar
  • Wow, this is exactly what I wanted to have. Thank you. – hslee Mar 06 '20 at 03:03
  • My bad. I wanted to accept it but I did not know how to proceed. By the way, the original tick mark is more like grey. Let me know if there are any problems. – hslee Mar 06 '20 at 10:15
  • @hslee, you can read this: https://stackoverflow.com/help/someone-answers. You can upvote any answer, but only one answer can be accepted. There is a difference between upvoting and accepting an answer; read the link and you will understand – PKumar Mar 06 '20 at 10:16
  • I have another related question. Instead of using the concept of distinguishable permutations, would you be able to partition n distinct objects into r distinct groups? Using the same example as the original post, suppose there are 10 distinct subjects, and I would like to allocate them to two equally sized groups. So, instead of starting from grps <- c('C', 'T'), can you start from grps <- c(1:10) and produce the intended results? – hslee Mar 06 '20 at 12:14
  • @hslee, as long as n (10) is evenly divisible by the number of items in grps (2 in this case), the above code should work perfectly. If instead n = 100 and grps is c(1, 2, 3), the code won't work, because 100/3 is 33.33, which is not a whole number. With n = 100 and grps <- c(1:10), the code should work. Also, to reiterate: if you are satisfied with the solution, you may also upvote it by clicking the up triangle – PKumar Mar 06 '20 at 12:18
  • Your code is perfectly fine, and my original question was about allocating subjects into equally sized groups, so "100/3 is 33.33" is not my concern. Is it okay for me to understand your code as partitioning 10 distinct subjects into 2 distinct groups? At least in the current case, distinguishable permutations and partitioning are the same and yield mathematically the same result. However, my additional question was more conceptual. – hslee Mar 06 '20 at 12:33
  • @hslee, yes you are right, but it will also work in cases where n is evenly divided by r... – PKumar Mar 06 '20 at 12:45
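
The follow-up about starting from the subjects themselves (grps <- c(1:10)) can be sketched along the lines of chinsoon12's combn comment under the question. This is a minimal illustration under the assumption of two equally sized, labelled groups; the names `subjects`, `treat`, `ctrl`, `parts`, and the helper `n_partitions` are hypothetical and not from the thread.

```r
subjects <- 1:10

# Each column of `treat` is one choice of 5 subjects for group T.
treat <- combn(subjects, 5)

# The complement of each choice gives the subjects in group C.
ctrl <- apply(treat, 2, function(x) setdiff(subjects, x))

# One row per partition: T's subjects first, then C's.
parts <- cbind(t(treat), t(ctrl))
colnames(parts) <- c(paste0("T", 1:5), paste0("C", 1:5))
nrow(parts)  # 252
parts[1, ]   # T = 1..5, C = 6..10

# Count of ways to split n distinct subjects into r labelled groups of equal size
# (a multinomial coefficient); valid only when n is divisible by r.
n_partitions <- function(n, r) {
  stopifnot(n %% r == 0)
  factorial(n) / factorial(n / r)^r
}
n_partitions(10, 2)  # 252
```

Each row here corresponds to exactly one of the 'C'/'T' strings above, which is why the two views give the same count of 252.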