Let's say I have a data frame test (dput given below) with a list-column, say items:

test <- structure(list(items = list('a', c('b', 'c'), c('d', 'e'), 'f', c('g', 'h')),
               ID = c(1,1,1,2,2)), row.names = c(NA, 5L), class = "data.frame")

library(tidyverse)
test %>% group_by(ID) %>%
  mutate(dummy = accumulate(items, ~paste(.x, .y)))

I am getting output with a list-column like this:

  items ID        dummy
1     a  1            a
2  b, c  1     a b, a c
3  d, e  1 a b d, a c e
4     f  2            f
5  g, h  2     f g, f h

I would like row 3 to contain four items, one for each possible combination, i.e. c("a b d", "a b e", "a c d", "a c e"). It doesn't matter, however, whether these are separate items in the list or not. In other words, dummy may be a nested (multi-level) list, as long as row 3 contains four items. I tried using expand.grid, but I am doing something wrong somewhere!

So my desired output would look like:

  items ID                      dummy
1     a  1                          a
2  b, c  1                   a b, a c
3  d, e  1 a b d, a c d, a b e, a c e
4     f  2                          f
5  g, h  2                   f g, f h
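
For context, the difference between the current and the desired output seems to come down to how paste() pairs its arguments. The following is a minimal sketch using toy vectors x and y (names chosen here for illustration, mirroring rows 2 and 3 of the ID == 1 group), not part of the original question:

```r
# Toy vectors mirroring the accumulated value after row 2 and the items in row 3
x <- c("a b", "a c")
y <- c("d", "e")

# paste() is vectorised: it pairs elements position by position
elementwise <- paste(x, y)

# outer() calls paste() on every (x, y) pair; as.vector() drops the matrix shape
all_pairs <- as.vector(outer(x, y, paste))

print(elementwise)
print(all_pairs)
```

The first result has two elements and the second has four, which is the shape the desired output asks for.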
ThomasIsCoding
AnilGoyal

6 Answers

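A data.table option using Reduce + outer:

library(data.table)

# Within each ID, paste the running result with every element of the next item
setDT(test)[
  ,
  dummy := .(Reduce(function(x, y) outer(x, y, paste),
    items,
    accumulate = TRUE
  )),
  ID
]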

gives

> test
   items ID                   dummy
1:     a  1                       a
2:   b,c  1                 a b,a c
3:   d,e  1 a b d,a c d,a b e,a c e
4:     f  2                       f
5:   g,h  2                 f g,f h
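
For readers without data.table, the same Reduce + outer idea can be sketched in base R. This is only an illustration under stated assumptions: paste_all and combine_group are hypothetical helper names, and the per-group step is done with a plain loop over split() indices rather than anything from the answer above.

```r
# Rebuild the example data with a list-column
test <- data.frame(ID = c(1, 1, 1, 2, 2))
test$items <- list("a", c("b", "c"), c("d", "e"), "f", c("g", "h"))

# Hypothetical helper: every pairwise paste, returned as a plain character vector
paste_all <- function(x, y) as.vector(outer(x, y, paste))

# Hypothetical helper: running combinations within one group, always as a list
combine_group <- function(items) {
  Reduce(paste_all, items, accumulate = TRUE, simplify = FALSE)
}

# Row indices per ID, then fill the new list-column group by group
idx <- split(seq_len(nrow(test)), test$ID)
test$dummy <- vector("list", nrow(test))
for (i in idx) test$dummy[i] <- combine_group(test$items[i])

str(test$dummy)
```

Wrapping outer() in as.vector() keeps each cell a plain character vector instead of an array.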
ThomasIsCoding

Another approach with expand.grid():

test %>%
  group_by(ID) %>%
  mutate(dummy = accumulate(items, ~ do.call("paste", expand.grid(.x, .y)))) %>%
  data.frame()

gives:

  items ID                      dummy
1     a  1                          a
2  b, c  1                   a b, a c
3  d, e  1 a b d, a c d, a b e, a c e
4     f  2                          f
5  g, h  2                   f g, f h
maydin

You can do this using an outer product to paste the two vectors...

test2 <- test %>% group_by(ID) %>%
  mutate(dummy = accumulate(items, ~outer(.x, .y, paste)))

str(test2)
grouped_df[,3] [5 x 3] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ items:List of 5
  ..$ : chr "a"
  ..$ : chr [1:2] "b" "c"
  ..$ : chr [1:2] "d" "e"
  ..$ : chr "f"
  ..$ : chr [1:2] "g" "h"
 $ ID   : num [1:5] 1 1 1 2 2
 $ dummy:List of 5
  ..$ : chr "a"
  ..$ : chr [1, 1:2] "a b" "a c"
  ..$ : chr [1, 1:2, 1:2] "a b d" "a c d" "a b e" "a c e"
  ..$ : chr "f"
  ..$ : chr [1, 1:2] "f g" "f h"
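
The str() output above shows that each outer() call adds a dimension to the result. A small sketch of that effect, and of one way to flatten it, using hypothetical names step2, step3 and flat:

```r
# Each outer() call appends a dimension to its result
step2 <- outer("a", c("b", "c"), paste)     # 1 x 2 matrix
step3 <- outer(step2, c("d", "e"), paste)   # 1 x 2 x 2 array

print(dim(step3))
print(length(step3))

# c() drops the dim attribute, leaving a plain character vector
flat <- c(step3)
print(flat)
```

If plain vectors are preferred in the list-column, wrapping the outer() call in c() inside accumulate() should have the same effect.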
Andrew Gustar

If you want every possible combination, use sapply over .x:

library(dplyr)
library(purrr)

test %>% 
  group_by(ID) %>%
  mutate(dummy = accumulate(items, ~c(sapply(.x, paste, .y)))) %>%
  pull(dummy)

#[[1]]
#[1] "a"

#[[2]]
#[1] "a b" "a c"

#[[3]]
#[1] "a b d" "a b e" "a c d" "a c e"

#[[4]]
#[1] "f"

#[[5]]
#[1] "f g" "f h"
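
Note that the combinations above come out in a different order from the outer()-based results elsewhere on this page. A minimal sketch comparing the two on toy vectors x and y (names assumed here for illustration):

```r
x <- c("a b", "a c")
y <- c("d", "e")

# sapply() builds one column per element of x; c() then flattens column by column
via_sapply <- c(sapply(x, paste, y))

# outer() varies x fastest, so the same strings appear in a different order
via_outer <- as.vector(outer(x, y, paste))

print(via_sapply)
print(via_outer)
print(setequal(via_sapply, via_outer))
```

Both hold the same four strings, so the choice only matters if the order within each cell is significant.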
Ronak Shah

There are also cross and cross2 from the purrr package (note that both are deprecated as of purrr 1.0.0 in favour of tidyr::expand_grid()):

library(tidyverse)

test %>% 
  group_by(ID) %>% 
  mutate(
    dummy = accumulate(items, cross2) %>% map_depth(unlist, .depth = 2)
  ) %>% 
  pull(dummy) %>% 
  str()
#> List of 5
#>  $ :List of 1
#>   ..$ : chr "a"
#>  $ :List of 2
#>   ..$ : chr [1:2] "a" "b"
#>   ..$ : chr [1:2] "a" "c"
#>  $ :List of 4
#>   ..$ : chr [1:3] "a" "b" "d"
#>   ..$ : chr [1:3] "a" "c" "d"
#>   ..$ : chr [1:3] "a" "b" "e"
#>   ..$ : chr [1:3] "a" "c" "e"
#>  $ :List of 1
#>   ..$ : chr "f"
#>  $ :List of 2
#>   ..$ : chr [1:2] "f" "g"
#>   ..$ : chr [1:2] "f" "h"

Created on 2021-05-18 by the reprex package (v1.0.0)

Peter H.

This solution can also be used:

library(dplyr)
library(purrr)

test %>%
  group_by(ID) %>%
  mutate(comb = accumulate(items[-1], .init = unlist(items[1]), 
                           ~ expand.grid(.x, .y) %>% 
                             {map2(.$Var1, .$Var2, ~ paste(.x, .y, sep = " "))} %>%
                             unlist())) %>%
  as.data.frame()

  items ID                       comb
1     a  1                          a
2  b, c  1                   a b, a c
3  d, e  1 a b d, a c d, a b e, a c e
4     f  2                          f
5  g, h  2                   f g, f h
Anoushiravan R