
I have the following data:

A   B            C             D
1   1501583974   <list [3]>  <tibble>
1   1501616585   <list [3]>  <tibble>
1   1501583344   <list [3]>  <tibble>
1   1501573386   <list [3]>  <tibble>

The code I have used:

data %>%
  unnest_wider(C, names_sep = "_") %>%
  unnest_wider(D, names_sep = "_")

This gives the output:

    A   B            C_1  C_2 C_3  D_1         D_2   
    1   1501583974   1    2   3   <list [1]>  <list [1]>
    1   1501616585   1    2   3   <list [1]>  <list [1]>
    1   1501583344   1    2   3   <list [1]>  <list [1]>
    1   1501573386   1    2   3   <list [1]>  <list [1]>

Calling unnest_wider() again on each of the remaining list columns is very tedious.
How can I write a loop that keeps going until no list columns are left to unnest?

Thanks

Z.Lin
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Sep 08 '20 at 02:41
  • Please add `dput(data)` to your question. – Ronak Shah Sep 08 '20 at 03:29

1 Answer


Here is a data frame with nested list and data frame columns.

library(tidyverse)

l <- list(y1 = 1, y2 = list(z1 = 1))

data <- tribble(
  ~x1,  ~list1,     ~tibble1,
    1,       l, as_tibble(l),
    1,       l, as_tibble(l),
    1,       l, as_tibble(l),
    1,       l, as_tibble(l)
)
data
#> # A tibble: 4 x 3
#>      x1 list1            tibble1         
#>   <dbl> <list>           <list>          
#> 1     1 <named list [2]> <tibble [1 × 2]>
#> 2     1 <named list [2]> <tibble [1 × 2]>
#> 3     1 <named list [2]> <tibble [1 × 2]>
#> 4     1 <named list [2]> <tibble [1 × 2]>

We can create a function, unnest_all, which recursively unnests all the list columns.

  • First, it finds all the list columns.
  • Then, if there are any list columns, it unnests each of them.
  • Finally, it calls unnest_all again to unnest any remaining list columns.
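unnest_all <- function(df) {
  # Names of every column that is still a list (this includes nested tibbles)
  list_columns <- df %>% keep(is.list) %>% names()
  
  # Base case: nothing left to unnest
  if (length(list_columns) == 0) {
    return(df)
  }

  # Unnest each list column one level; all_of() selects by the stored name
  for (list_column in list_columns) {
    df <-
      df %>%
      unnest_wider(all_of(list_column), names_sep = "_")
  }
  # Recurse to handle any list columns exposed by this pass
  unnest_all(df)
}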
unnest_all(data)
#> # A tibble: 4 x 5
#>      x1 list1_y1 list1_y2_z1 tibble1_y1 tibble1_y2_z1
#>   <dbl>    <dbl>       <dbl>      <dbl>         <dbl>
#> 1     1        1           1          1             1
#> 2     1        1           1          1             1
#> 3     1        1           1          1             1
#> 4     1        1           1          1             1
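
Since the question asked for a loop, here is a minimal sketch of the same idea without recursion. The name unnest_all_loop is hypothetical, and the sketch assumes every list column can actually be unnested (no empty lists); otherwise unnest_wider() may fail as it would in the recursive version.

```r
library(tidyverse)

# Hypothetical iterative counterpart to the recursive unnest_all:
# keep making passes until no list columns remain.
unnest_all_loop <- function(df) {
  repeat {
    # Names of the columns that are still lists
    list_columns <- df %>% keep(is.list) %>% names()

    # Stop as soon as there is nothing left to unnest
    if (length(list_columns) == 0) break

    # Unnest each list column by one level
    for (list_column in list_columns) {
      df <- df %>% unnest_wider(all_of(list_column), names_sep = "_")
    }
  }
  df
}
```

Calling unnest_all_loop(data) is intended to behave like unnest_all(data); each pass of the repeat loop plays the role of one recursive call.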
Paul
  • I got the following error: `Error in names(x) <- paste0(col, names_sep, index(x)) : 'names' attribute [1] must be the same length as the vector [0]` – Rohan Kataria Sep 08 '20 at 14:38
  • Your dataset has empty lists which cannot be unnested. Modify the function to ignore empty lists: `list_columns <- df %>% keep(is.list) %>% discard(~any(map_lgl(., is_empty))) %>% names()`. – Paul Sep 08 '20 at 14:51
  • I had no idea you could call a function within itself. I can't even understand how this is able to work. – Phil Dec 14 '20 at 19:43
  • As a side effect, it also picks up nested data frames, not only lists: `x <- data.frame(driver = c("Bowser", "Peach"), occupation = c("Koopa", "Princess")) x$vehicle <- data.frame(model = c("Piranha Prowler", "Royal Racer")) x$vehicle$stats <- data.frame(speed = c(55, 34), weight = c(67, 24), drift = c(35, 32))` `x %>% keep(is.list) %>% names()` returns `"vehicle"`, i.e. a data frame – PerseP Sep 22 '22 at 11:38
  • @Phil It's called a recursive function – PerseP Sep 22 '22 at 13:30
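
For reference, the empty-list workaround from the comments can be folded into the whole function roughly as follows. This is only a sketch: the name unnest_all_skip_empty is hypothetical, and it uses base R's lengths() as a stand-in for the `map_lgl(., is_empty)` check in the comment, on the assumption that both flag the same columns when the elements are lists.

```r
library(tidyverse)

# Hypothetical variant of unnest_all that leaves alone any list column
# containing at least one empty element.
unnest_all_skip_empty <- function(df) {
  list_columns <- df %>%
    keep(is.list) %>%
    # Drop columns with any zero-length element (stand-in for is_empty)
    discard(~ any(lengths(.x) == 0)) %>%
    names()

  # Base case: no unnestable list columns remain
  if (length(list_columns) == 0) {
    return(df)
  }

  for (list_column in list_columns) {
    df <- df %>% unnest_wider(all_of(list_column), names_sep = "_")
  }

  # The function calls itself until the base case above is reached
  unnest_all_skip_empty(df)
}
```

Columns that contain empty lists are returned unchanged, so the result can still hold list columns that need separate handling.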