2

This question is very similar to this one, but instead of putting one item from each original group into each output group, I want multiple items from each original group in each output group.

I have the following data.

list1 <- list(Group_1 = c("1", "2", "3", "4", "5", "6"), Group_2 = c("13", "14", "15", "16", "17", "18"), Group_3 = c("19", "20", "21", "22", "23", "24", "25"))
Number_of_Items_From_Each_Original_Group_to_End_up_in_Each_Output_Group <- 2

Group_1 contains 6 items, Group_2 contains 6 items, and Group_3 contains 7 items. Based on these initial groups, I want to place items into 3 new groups, making sure that each new group contains the same number (2, or Number_of_Items_From_Each_Original_Group_to_End_up_in_Each_Output_Group) of items in it from each original group. Furthermore, none of the items can be repeated in each of these 3 new groups - each item can only be used once for each of the new groups. For example, one possible output could be this output below.

(list(New_Group_1 = c("1", "2", "13", "14", "19", "20"), New_Group_2 = c("3", "4", "15", "16", "21", "22"), New_Group_3 = c("5", "6", "17", "18", "23", "24")))

I'd actually like to generate a list of output that shows every possible output (every possible combination). For each list element, there should be three new groups, and each group should have distinct items in it. For example, in addition to the solution above, another one could be the following.

(list(New_Group_1 = c("3", "4", "13", "14", "19", "20"), New_Group_2 = c("5", "6", "15", "16", "21", "22"), New_Group_3 = c("1", "2", "17", "18", "23", "24")))

The question I asked here is very similar, except in that question, I only put one item from each original group into each new group. Here, I'd like to have an option of putting more than 1 item from each original group into each new group.

David Moore
  • 670
  • 3
  • 15

2 Answers2

3

This is fairly tricky, and quite slow, since there are 23,625 possible combinations of the three groups as specified

n <- 2

all_pairs <- lapply(list1, function(group) {
  all_combs <- apply(combn(ncol(combn(length(group), n)), length(list1)), 
                     2,
        function(x) c(combn(length(group), n)[,x]), simplify = FALSE)
  
  all_combs[sapply(all_combs, function(x) !any(duplicated(x)))]
})

index_df <- do.call(expand.grid, lapply(all_pairs, function(x) seq(length(x))))

result <- apply(index_df, 1, function(x) {
  m <- t(sapply(seq_along(x), function(y) list1[[y]][all_pairs[[y]][[x[y]]]]))
  setNames(lapply(split(m, (seq_along(m)-1) %/% (length(list1) * n)),
         function(x) c(t(matrix(x, ncol = n)))), names(list1))
}, simplify = FALSE)

The resulting list looks as follows:

head(result)
#> [[1]]
#> [[1]]$Group_1
#> [1] "1"  "2"  "13" "14" "19" "20"
#> 
#> [[1]]$Group_2
#> [1] "3"  "4"  "15" "16" "21" "22"
#> 
#> [[1]]$Group_3
#> [1] "5"  "6"  "17" "18" "23" "24"
#> 
#> 
#> [[2]]
#> [[2]]$Group_1
#> [1] "1"  "2"  "13" "14" "19" "20"
#> 
#> [[2]]$Group_2
#> [1] "3"  "5"  "15" "16" "21" "22"
#> 
#> [[2]]$Group_3
#> [1] "4"  "6"  "17" "18" "23" "24"
#> 
#> 
#> [[3]]
#> [[3]]$Group_1
#> [1] "1"  "2"  "13" "14" "19" "20"
#> 
#> [[3]]$Group_2
#> [1] "3"  "6"  "15" "16" "21" "22"
#> 
#> [[3]]$Group_3
#> [1] "4"  "5"  "17" "18" "23" "24"
#> 
#> 
#> [[4]]
#> [[4]]$Group_1
#> [1] "1"  "3"  "13" "14" "19" "20"
#> 
#> [[4]]$Group_2
#> [1] "2"  "4"  "15" "16" "21" "22"
#> 
#> [[4]]$Group_3
#> [1] "5"  "6"  "17" "18" "23" "24"
#> 
#> 
#> [[5]]
#> [[5]]$Group_1
#> [1] "1"  "3"  "13" "14" "19" "20"
#> 
#> [[5]]$Group_2
#> [1] "2"  "5"  "15" "16" "21" "22"
#> 
#> [[5]]$Group_3
#> [1] "4"  "6"  "17" "18" "23" "24"
#> 
#> 
#> [[6]]
#> [[6]]$Group_1
#> [1] "1"  "3"  "13" "14" "19" "20"
#> 
#> [[6]]$Group_2
#> [1] "2"  "6"  "15" "16" "21" "22"
#> 
#> [[6]]$Group_3
#> [1] "4"  "5"  "17" "18" "23" "24"

Created on 2022-06-09 by the reprex package (v2.0.1)

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • I should have been more specific - I want there to be three output groups as well, each with distinct items in it. I'll update my question. – David Moore Jun 09 '22 at 20:13
  • @DavidMoore in that case, why not just `sample` from the above list? – Allan Cameron Jun 09 '22 at 20:17
  • I just posted a similar question at https://stackoverflow.com/questions/72912733/determining-all-possible-combinations-of-items-with-a-grouping-variable-allowin - any help is greatly appreciated! – David Moore Jul 08 '22 at 14:17
1

Probably you can define a function f like below (which generates all permutations grouped by n elements out of v)

f <- function(v, n = 2) {
    if (length(v) < n) {
        return(list(NULL))
    }
    if (length(v) == n) {
        return(list(v))
    }
    x <- combn(v, n, simplify = FALSE)
    unlist(lapply(x, function(p) Map(rbind, list(p), f(v[!v %in% p], n))), recursive = FALSE)
}

res <- apply(expand.grid(lapply(list1, f)), 1, function(x) asplit(do.call(cbind, x), 1))

and a subset of result looks like

> apply(head(expand.grid(lapply(list1, f))), 1, fun .... [TRUNCATED]
$`1`
$`1`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`1`[[2]]
[1] "3"  "4"  "15" "16" "21" "22"

$`1`[[3]]
[1] "5"  "6"  "17" "18" "23" "24"


$`2`
$`2`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`2`[[2]]
[1] "3"  "5"  "15" "16" "21" "22"

$`2`[[3]]
[1] "4"  "6"  "17" "18" "23" "24"


$`3`
$`3`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`3`[[2]]
[1] "3"  "6"  "15" "16" "21" "22"

$`3`[[3]]
[1] "4"  "5"  "17" "18" "23" "24"


$`4`
$`4`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`4`[[2]]
[1] "4"  "5"  "15" "16" "21" "22"

$`4`[[3]]
[1] "3"  "6"  "17" "18" "23" "24"


$`5`
$`5`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`5`[[2]]
[1] "4"  "6"  "15" "16" "21" "22"

$`5`[[3]]
[1] "3"  "5"  "17" "18" "23" "24"


$`6`
$`6`[[1]]
[1] "1"  "2"  "13" "14" "19" "20"

$`6`[[2]]
[1] "5"  "6"  "15" "16" "21" "22"

$`6`[[3]]
[1] "3"  "4"  "17" "18" "23" "24"
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
  • I just posted a similar question at https://stackoverflow.com/questions/72912733/determining-all-possible-combinations-of-items-with-a-grouping-variable-allowin - any help is greatly appreciated! – David Moore Jul 08 '22 at 14:17