75

I have a (fairly long) list of vectors. The vectors consist of Russian words that I got by using the strsplit() function on sentences.

The following is what head() returns:

[[1]]
[1] "модно"     "создавать" "резюме"    "в"         "виде"     

[[2]]
[1] "ты"        "начианешь" "работать"  "с"         "этими"    

[[3]]
[1] "модно"            "называть"         "блогер-рилейшенз" "―"                "начинается"       "задолго"         

[[4]]
[1] "видел" "по"    "сыну," "что"   "он"   

[[5]]
[1] "четырнадцать," "я"             "поселился"     "на"            "улице"        

[[6]]
[1] "широко"     "продолжали" "род."

Note that the vectors have different lengths.

What I want is to be able to read the first word of each sentence, then the second word, the third, and so on.

The desired result would be something like this:

    P1              P2           P3                 P4    P5           P6
[1] "модно"         "создавать"  "резюме"           "в"   "виде"       NA
[2] "ты"            "начианешь"  "работать"         "с"   "этими"      NA
[3] "модно"         "называть"   "блогер-рилейшенз" "―"   "начинается" "задолго"         
[4] "видел"         "по"         "сыну,"            "что" "он"         NA
[5] "четырнадцать," "я"          "поселился"        "на"  "улице"      NA
[6] "широко"        "продолжали" "род."             NA    NA           NA

I tried simply using data.frame(), but that didn't work because the rows have different lengths. I also tried rbind.fill() from the plyr package, but that function can only process matrices.

I found some other questions here (that's where I got the plyr idea from), but those were all about combining, for instance, two data frames of different sizes.

Thanks for your help.

Ico

7 Answers

107

A one-liner with plyr:

plyr::ldply(word.list, rbind)
Ramnath
59

Try this:

word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])
n.obs <- sapply(word.list, length)
seq.max <- seq_len(max(n.obs))
mat <- t(sapply(word.list, "[", i = seq.max))

The trick is that indexing past the end of a vector returns NA, so

c(1:2)[1:4]

returns the vector followed by two NAs.
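To unpack the trick a little: in R, `[` is an ordinary function, so `sapply(word.list, "[", i = seq.max)` simply evaluates `x[seq.max]` for each element `x`. A minimal sketch on a made-up two-element list (the contents of `word.list` here are illustrative, not the asker's data):

```r
# Illustrative data (hypothetical, not the asker's list)
word.list <- list(c("a", "b"), c("a", "b", "c", "d"))

# `[` is a regular function: x[1:4] and `[`(x, 1:4) are the same call
x <- c("a", "b")
same <- identical(x[1:4], `[`(x, 1:4))

# Indexing past the end of a vector pads the result with NA
padded <- x[1:4]

# Apply the same out-of-range index to every element, then transpose
# so that each original vector becomes one row
seq.max <- seq_len(max(lengths(word.list)))
mat <- t(sapply(word.list, "[", i = seq.max))
```

Without the `t()`, each vector would end up in a column instead of a row.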

adibender
  • 13
    this could be further condensed to one line by: `sapply(word.list, '[', seq(max(sapply(word.list, length))))` (as shown [**here**](http://stackoverflow.com/questions/5531471/combining-unequal-columns-in-r)) – Arun Mar 04 '13 at 12:40
  • 5
    For those who would use @Arun's one-line solution, note that there must be a transpose `t()` to create the appropriate columns, as in the original question. – Ashe Mar 28 '17 at 21:12
  • @adibender Awesome solution. Could you explain the function "[" ? – Ahmed Abdullah Dec 25 '22 at 23:03
24

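Another option is stri_list2matrix() from the stringi package:

library(stringi)
# Example data (from @juba's answer)
l <- list(c("a","b","c"), c("a2","b2"), c("a3","b3","c3","d3"))
stri_list2matrix(l, byrow=TRUE)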
#    [,1] [,2] [,3] [,4]
#[1,] "a"  "b"  "c"  NA  
#[2,] "a2" "b2" NA   NA  
#[3,] "a3" "b3" "c3" "d3"

NOTE: Data from @juba's post.

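Or, as @Valentin mentioned in the comments, in base R (this returns one column per list element, so wrap it in t() to get one row per element):

sapply(l, "length<-", max(lengths(l)))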
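For readers unfamiliar with `length<-`: it is the replacement function behind `length(x) <- n`, and lengthening a vector this way pads it with NA. A small sketch on @juba's example data (the names `v`, `by_col` and `by_row` are only for illustration):

```r
l <- list(c("a", "b", "c"), c("a2", "b2"), c("a3", "b3", "c3", "d3"))

# `length<-`(x, n) is the function form of: length(x) <- n
v <- c("a2", "b2")
length(v) <- 4  # v is now padded with two NAs

# Pad every element to the maximum length; each element becomes a COLUMN
by_col <- sapply(l, "length<-", max(lengths(l)))

# Transpose to get one row per list element
by_row <- t(by_col)
```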

Or using the tidyverse:

library(purrr)
library(tidyr)
library(dplyr)
tibble(V = l) %>% 
   unnest_wider(V, names_sep = "")
# A tibble: 3 × 4
#   V1    V2    V3    V4   
#   <chr> <chr> <chr> <chr>
# 1 a     b     c     <NA> 
# 2 a2    b2    <NA>  <NA> 
# 3 a3    b3    c3    d3   
akrun
  • 4
    I think your elegant base R solution given [here](https://stackoverflow.com/questions/33613337/the-simplest-way-to-convert-a-list-with-various-length-vectors-to-a-data-frame-i#answer-33622855) is worth being mentioned as well: `sapply(l, "length<-", max(lengths(l)))` – Valentin_Ștefan Jan 26 '18 at 21:47
  • If I have a list nested inside a list, how do I do this? – PesKchan Jan 31 '21 at 11:36
  • 1
    @PesKchan For that you may need a nested loop i.e. `lapply(l, function(subl) lapply(subl, "length<-", max(lengths(subl))))` – akrun Jan 31 '21 at 12:59
  • https://stackoverflow.com/questions/65978952/convert-list-of-different-length-into-data-table-for-markdown-for-html-format . I would ask you to have a look at this. My idea was to change the list to a data frame and then to a data table. Is there a way to go directly to a data table? – PesKchan Jan 31 '21 at 13:27
  • Sir @akrun, I'm definitely not one of them; much of my PhD biology data-analysis code is taken from your answers to various questions. – PesKchan Jan 31 '21 at 13:30
  • @akrun do you know any `dplyr` equivalent? – Álvaro A. Gutiérrez-Vargas Aug 05 '22 at 09:00
  • 1
    @ÁlvaroA.Gutiérrez-Vargas updated with tidyverse – akrun Aug 05 '22 at 14:46
18

You can do something like this:

## Example data
l <- list(c("a","b","c"), c("a2","b2"), c("a3","b3","c3","d3"))
## Compute maximum length
max.length <- max(sapply(l, length))
## Add NA values to list elements
l <- lapply(l, function(v) { c(v, rep(NA, max.length-length(v)))})
## Rbind
do.call(rbind, l)

Which gives:

     [,1] [,2] [,3] [,4]
[1,] "a"  "b"  "c"  NA  
[2,] "a2" "b2" NA   NA  
[3,] "a3" "b3" "c3" "d3"
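To get closer to the layout asked for in the question, the resulting matrix can be given column names and turned into a data frame. A sketch, assuming the `P1`, `P2`, ... naming from the question's desired output (`mat` and `df` are illustrative names):

```r
l <- list(c("a", "b", "c"), c("a2", "b2"), c("a3", "b3", "c3", "d3"))

# Pad each vector with NA up to the longest length, then row-bind
max.length <- max(lengths(l))
mat <- do.call(rbind, lapply(l, function(v) c(v, rep(NA, max.length - length(v)))))

# Name the columns P1, P2, ... as in the question's desired output
colnames(mat) <- paste0("P", seq_len(ncol(mat)))

# Optional: convert to a data frame, keeping the words as character strings
df <- as.data.frame(mat, stringsAsFactors = FALSE)
```

Column `P1` then holds the first word of every sentence, `P2` the second, and so on.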
juba
  • Aha -- what we forgot (Juba and me) is that you don't need to "fill in" the original list elements with `NA` values. The `sapply` snippet I put in a comment returns `NA` for list elements which are shorter than the requested index value. Ain't it nice of `sapply` not to crash? :-) – Carl Witthoft Mar 04 '13 at 15:33
  • 1
    Instead of `max(sapply(l, length))`, you can also use the wrapper `lengths`: `max(lengths(l))` – tjebo Jan 04 '22 at 21:50
8

You could also use rbindlist() from the data.table package.

Use lapply() to transpose each vector into a one-row data.table (or data.frame); I'm not sure how much this costs in speed. Then bind the rows with rbindlist(fill = TRUE), which fills the missing cells with NA.

require(data.table)

l = list(c("a","b","c"), c("a2","b2"), c("a3","b3","c3","d3"))
dt = rbindlist(
  lapply(l, function(x) data.table(t(x))),
  fill = TRUE
)
andschar
0

As the question was to convert a list to a data.frame, you can pad all list vectors to the maximum length, max(lengths(L)), by applying length<- with lapply(), and then use list2DF() (available since R 4.0.0) to convert the padded list to a data.frame. Note that each vector becomes a column here, not a row.

L <- list(a=1, b=2:3, c=3:5)

list2DF(lapply(L, `length<-`, max(lengths(L))))
#   a  b c
#1  1  2 3
#2 NA  3 4
#3 NA NA 5
GKi
-1

Another option is to define a function like this (it mimics rbind.fill, but for columns), or to use the one from the rowr package:

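cbind.fill <- function(...) {
  # Coerce every input to a matrix
  nm <- lapply(list(...), as.matrix)
  # Largest number of rows among the inputs
  n <- max(sapply(nm, nrow))
  # Pad each matrix with NA rows up to n rows, then bind column-wise
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}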

This answer is taken from here (where there are also some usage examples).
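A short usage sketch of the function above, on made-up inputs (the data and the names `by_col` and `by_row` are illustrative). It pads columns rather than rows, so for the question's layout (one row per vector) the result has to be transposed:

```r
cbind.fill <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

# Each vector becomes one column, padded with NA at the bottom
by_col <- cbind.fill(c("a", "b", "c"), c("a2", "b2"))

# For a list of vectors, pass the elements via do.call() and transpose
l <- list(c("a", "b", "c"), c("a2", "b2"), c("a3", "b3", "c3", "d3"))
by_row <- t(do.call(cbind.fill, l))
```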

jgarces
  • Not sure how this works; you don't provide an example. It seems like an exact copy from here: https://stackoverflow.com/questions/7962267/cbind-a-dataframe-with-an-empty-dataframe-cbind-fill Also, rowr is not on CRAN anymore. – andschar Aug 25 '20 at 15:02