cbind equally named vectors in multiple data.frames in a list to a single data.frame

Question

I have a list similar to this one:

set.seed(1602)
l <- list(data.frame(subst_name = sample(LETTERS[1:10]), perc = runif(10), crop = rep("type1", 10)),
      data.frame(subst_name = sample(LETTERS[1:7]), perc = runif(7), crop = rep("type2", 7)),
      data.frame(subst_name = sample(LETTERS[1:4]), perc = runif(4), crop = rep("type3", 4)),
      NULL,
      data.frame(subst_name = sample(LETTERS[1:9]), perc = runif(9), crop = rep("type5", 9)))

How can I extract the subst_name column of each data.frame and combine them with cbind() (or a similar function) into a new data.frame without changing the order of the values within each column? Additionally, the columns should be named after the corresponding crop type (this is possible because the crop type is unique for each data.frame).

EDIT: The output should look as follows:

Having read the comments, I'm aware that this doesn't make much sense within R, but for the sake of having a look at the output, the data.frame's View option is quite handy.

The question as it stands doesn't really make sense. What should the result look like? — Ista, Dec 16 '16 at 20:05
Please illustrate with data your expected result. To use cbind, number of rows must be the same or multiples of each other (i.e., 20 and 10 obs works but not 10 and 17 obs or 20 and 23 obs). — Parfait, Dec 16 '16 at 21:19
I added the desired result. And I reduced the length of the vectors for readability. — andschar, Dec 17 '16 at 09:56

nestor556 · Answer 1 · 2016-12-19T17:02:18.010

It is not really correct to do this with the given example, because the number of rows is not the same in each of the list's data frames. But if you don't care, you can do:

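# Drop the NULL entries from the list; they carry no data
nullElements <- sapply(l, is.null)
l <- l[!nullElements]
# Extract subst_name from each data frame as a character vector
columns <- lapply(l, function(x) as.character(x$subst_name))
# cbind recycles the shorter vectors to fill all rows
newDf <- as.data.frame(Reduce(cbind, columns))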
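To make the recycling visible, here is a minimal sketch with two made-up vectors (`x` and `y` are hypothetical and not part of the question's data). The shorter vector is repeated until it fills the rows of the longer one:

```r
# Hypothetical toy vectors of unequal length
x <- c("a", "b", "c", "d")
y <- c("p", "q")

# cbind() on plain vectors recycles the shorter one to the longer length
m <- cbind(x, y)
print(m)
```

Because 2 divides 4 evenly, this runs silently. With lengths that are not multiples of each other, cbind() still recycles but emits a warning.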

If you don't want recycled elements in the columns, you can do:

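for (i in seq_len(ncol(newDf))) {
  colLength <- nrow(l[[i]])
  # Only pad columns shorter than the result; for a full-length column,
  # (colLength + 1):nrow(newDf) would count backwards and overwrite the last row
  if (colLength < nrow(newDf)) {
    newDf[(colLength + 1):nrow(newDf), i] <- NA
  }
}
newDf <- newDf[seq_len(max(sapply(l, nrow))), ] # remove possible extra NA rows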

Note that I edited my previous code to remove NULL entries from l, to simplify things.

This doesn't lead to the desired output, as the vectors are recycled (once they reach their last row) whereas they should be filled up with NAs. Sorry, this was missing from the question before the edit. — andschar, Dec 17 '16 at 09:54

score 0 · Accepted Answer · edited May 23 '17 at 10:30

With the help of this SO question I came up with the following solution. (There's probably room for improvement.)

a <- lapply(l, '[[', 1) # extract the first column (subst_name) of each df in the list
a <- Filter(function(x) !is.null(unlist(x)), a) # remove NULLs
a <- lapply(a, as.character)
max.length <- max(sapply(a, length))
## Add NA values to list elements
b <- lapply(a, function(v) { c(v, rep(NA, max.length-length(v)))})
## Bind the padded vectors column-wise
e <- as.data.frame(do.call(cbind, b))
## Name each column after the crop type of its data.frame
names(e) <- unlist(lapply(lapply(lapply(l, '[[', "crop"), '[[', 2), as.character))
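As one possible tightening of the same idea, here is a sketch that pads each vector by assigning to `length()` (which fills with NA) and names the columns before building the data.frame. The helper names `dfs`, `cols`, `padded` and `res` are made up for this illustration, and it assumes every non-NULL element has `subst_name` and `crop` columns as in the question:

```r
set.seed(1602)
l <- list(data.frame(subst_name = sample(LETTERS[1:10]), perc = runif(10), crop = rep("type1", 10)),
          data.frame(subst_name = sample(LETTERS[1:7]), perc = runif(7), crop = rep("type2", 7)),
          data.frame(subst_name = sample(LETTERS[1:4]), perc = runif(4), crop = rep("type3", 4)),
          NULL,
          data.frame(subst_name = sample(LETTERS[1:9]), perc = runif(9), crop = rep("type5", 9)))

# Drop NULL entries, then pull subst_name out of each data.frame as character
dfs  <- Filter(Negate(is.null), l)
cols <- lapply(dfs, function(df) as.character(df$subst_name))

# Extending a vector's length pads it with NA, so no recycling happens
n <- max(lengths(cols))
padded <- lapply(cols, function(v) { length(v) <- n; v })

# Name each column after the (unique) crop type of its data.frame
names(padded) <- vapply(dfs, function(df) as.character(df$crop[1]), character(1))

res <- as.data.frame(padded, stringsAsFactors = FALSE)
```

The order of the values within each column is preserved, and shorter columns end in NA instead of repeated values.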
