41

Can I unnest a list column directly into n columns?

The list can be assumed to regular, with all elements being of equal length.

If instead of a list column I would have a character vector, I could tidyr::separate. I can tidyr::unnest, but we need another helper variable to be able to tidyr::spread. Am I missing an obvious method?

Example data:

library(tibble)

df1 <- data_frame(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)
# A tibble: 3 x 2
  gr    values   
  <chr> <list>   
1 a     <int [2]>
2 b     <int [2]>
3 c     <int [2]>

Goal:

df2 <- data_frame(
  gr = c('a', 'b', 'c'),
  V1 = c(1, 3, 5),
  V2 = c(2, 4, 6)
)
# A tibble: 3 x 3
  gr       V1    V2
  <chr> <dbl> <dbl>
1 a        1.    2.
2 b        3.    4.
3 c        5.    6.

Current method:

unnest(df1) %>% 
  group_by(gr) %>% 
  mutate(r = paste0('V', row_number())) %>% 
  spread(r, values)
Axeman
  • 32,068
  • 8
  • 81
  • 94

6 Answers6

43

with tidyr 1.0.0 you can do :

library(tidyr)
df1 <- tibble(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)

unnest_wider(df1, values)
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> # A tibble: 3 x 3
#>   gr     ...1  ...2
#>   <chr> <int> <int>
#> 1 a         1     2
#> 2 b         3     4
#> 3 c         5     6

Created on 2019-09-14 by the reprex package (v0.3.0)

The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider doesn't want to guess silently.

We can name them beforehand to avoid it :

df1 %>%
  dplyr::mutate(values = purrr::map(values, setNames, c("V1","V2"))) %>%
  unnest_wider(values)
#> # A tibble: 3 x 3
#>   gr       V1    V2
#>   <chr> <int> <int>
#> 1 a         1     2
#> 2 b         3     4
#> 3 c         5     6

Or just use suppressMessages() or purrr::quietly()

moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
  • The verbosity might change in future releases, if I forget to edit the answer when it happens please someone ping me here! – moodymudskipper Jan 18 '20 at 10:46
  • What if I have more than one column to unnest? – Laura Mar 20 '21 at 20:11
  • I had a similar problem. Each element of my list-column is a list of 3 numbers, and unnesting the column with unnest() would provide me only the first value. Why is unnest_wider needed? – JASC Jul 29 '21 at 02:29
  • Laura and @JASC check the [answer](https://stackoverflow.com/a/49690697/17055951) of Paweł Kozielski-Romaneczko. – rubengavidia0x Feb 08 '22 at 00:28
13

With data.table it's pretty simple:

library("data.table")
setDT(df1)
df1[, c("V1", "V2") := transpose(values)]
df1
#    gr values V1 V2
# 1:  a    1,2  1  2
# 2:  b    3,4  3  4
# 3:  c    5,6  5  6
Victorp
  • 13,636
  • 2
  • 51
  • 55
  • 4
    alternative for when the list-elements aren't of equal length: `df1[, rn := .I][, transpose(values), by = .(gr, rn)][]` – Jaap Apr 06 '18 at 10:07
12
library(tibble)

df1 <- data_frame(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)

library(tidyverse)

df1 %>%
  mutate(r = map(values, ~ data.frame(t(.)))) %>%
  unnest(r) %>%
  select(-values)

# # A tibble: 3 x 3
#   gr       X1    X2
#   <chr> <int> <int>
# 1 a         1     2
# 2 b         3     4
# 3 c         5     6
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
8

Maybe this:

cbind(df1[, "gr"], do.call(rbind, df1$values))
zx8754
  • 52,746
  • 12
  • 114
  • 209
1

Another one:

library(tibble)
library(dplyr)

df1 <- data_frame(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)

df %>% mutate(V1 = sapply(values, "[[", 1), V2 = sapply(values, "[[", 2))

# A tibble: 3 x 4
  gr    values       V1    V2
  <chr> <list>    <int> <int>
1 a     <int [2]>     1     2
2 b     <int [2]>     3     4
3 c     <int [2]>     5     6

Edit:

When the listed vectors are very long, and writing by hand V1 = sapply(values, "[[", index) is not convenient, then you can combine it with f_interp from lazyeval:

library(tibble)
library(dplyr)
library(lazyeval)
df <- data_frame(gr = c('a', 'b', 'c'), values = list(1:11, 3:13, 5:15))
nums <- c(1:11)
ll <- lapply(nums, function(nr) f_interp(~sapply(values, "[[", uq(nr))))
mutate_(df, .dots=setNames(ll, paste("V", nums, sep="")))

# A tibble: 3 x 12
  gr    values        V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
  <chr> <list>     <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 a     <int [11]>     1     2     3     4     5     6     7     8     9    10
2 b     <int [11]>     3     4     5     6     7     8     9    10    11    12
3 c     <int [11]>     5     6     7     8     9    10    11    12    13    14
1

I have had a similar problem several times. My solution is admittedly clunky compared to the other answers, but reporting it for completeness sake.

library(tibble)
df1 <- data_frame(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)

matrix(unlist(df1[1])) -> grs
matrix(unlist(df1[2]), byrow=T, ncol=2) -> vals

Result:

> data.frame(grs, vals)
  grs X1 X2
1   a  1  2
2   b  3  4
3   c  5  6 
Axeman
  • 32,068
  • 8
  • 81
  • 94
Otto Kässi
  • 2,943
  • 1
  • 10
  • 27
  • Thanks, but we definitely don't want to convert to `matrix` and coerce everything into characters! – Axeman Apr 06 '18 at 12:42