Merging data frames in a list

Question

This is an offshoot of an earlier post, where the discussion turned to simplifying my function and eliminating the need to merge the data frames that result from an lapply. Although tools such as dplyr and data.table make the merge unnecessary, I'd still like to know how to do the merge in this situation. I have simplified the function that produces the list, based on this answer to my previous question.

#Reproducible data
Data <- data.frame("custID" = c(1:10, 1:20),
    "v1" = rep(c("A", "B"), c(10,20)), 
    "v2" = c(30:21, 20:19, 1:3, 20:6), stringsAsFactors = TRUE)

#Split-Apply function
res <- lapply(split(Data, Data$v1), function(df) {
    cutoff <- quantile(df$v2, c(0.8, 0.9))
    top_pct <- ifelse(df$v2 > cutoff[2], 10, ifelse(df$v2 > cutoff[1], 20, NA))
    na.omit(data.frame(custID = df$custID, top_pct))
    })

This gives me the following results:

$A
  custID top_pct
1      1      10
2      2      20

$B
  custID top_pct
1      1      10
2      2      20
6      6      10
7      7      20

I would like the results to look like this:

  custID A_top_pct B_top_pct
1      1        10        10
2      2        20        20
3      6        NA        10
4      7        NA        20

What's the best way to get there? Should I be doing some sort of reshaping? If I do that, do I have to merge the data frames first?

Here's my solution, which may not be the best. (In the real application, there would be more than two data frames in the list.)

#Change the new variable name
names1 <- names(res)

for(i in seq_along(res)) {
    names(res[[i]])[2] <- paste0(names1[i], "_top_pct")
}

#Merge the results
res_m <- res[[1]]
for(i in seq_along(res)[-1]) {
    res_m <- merge(res_m, res[[i]], by = "custID", all = TRUE)
}
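
On the reshaping question raised above, here is a minimal sketch of that route in base R, for comparison with the merge loop. It assumes a stand-in res built by hand to match the list printed earlier; the names long, wide and grp are illustrative only. The data frames are stacked rather than merged, and reshape then spreads top_pct into one column per group.

```r
# Stand-in for res, shaped like the list printed in the question
res <- list(
  A = data.frame(custID = c(1L, 2L), top_pct = c(10, 20)),
  B = data.frame(custID = c(1L, 2L, 6L, 7L), top_pct = c(10, 20, 10, 20))
)

# Stack the list into one long data frame, tagging each row with its list name
long <- do.call(rbind, Map(function(df, nm) cbind(df, grp = nm), res, names(res)))

# Spread top_pct into one column per group, keyed on custID
wide <- reshape(long, idvar = "custID", timevar = "grp",
                direction = "wide", sep = "_")
rownames(wide) <- NULL
wide
```

With this approach the columns come out as top_pct_A and top_pct_B, so a rename would still be needed to get A_top_pct and B_top_pct.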

akrun · Accepted Answer · 2015-05-01T13:30:03.193


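You can try Reduce with merge. Reduce applies merge to the first two data frames, then merges that result with the next one, and so on through the list: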

 Reduce(function(...) merge(..., by='custID', all=TRUE), res)
 #  custID top_pct.x top_pct.y
 #1      1        10        10
 #2      2        20        20
 #3      6        NA        10
 #4      7        NA        20
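
The .x and .y column names above are merge's default suffixes for clashing names. A minimal sketch of one way to get the A_top_pct / B_top_pct names asked for in the question: rename each top_pct column from the list names before reducing. It assumes a stand-in res built by hand to match the printed list; renamed and merged are illustrative names.

```r
# Stand-in for res, shaped like the list printed in the question
res <- list(
  A = data.frame(custID = c(1L, 2L), top_pct = c(10, 20)),
  B = data.frame(custID = c(1L, 2L, 6L, 7L), top_pct = c(10, 20, 10, 20))
)

# Prefix the second column of each data frame with its list name
renamed <- Map(function(df, nm) {
  names(df)[2] <- paste0(nm, "_top_pct")
  df
}, res, names(res))

# Full outer join on custID, folded across the whole list
merged <- Reduce(function(x, y) merge(x, y, by = "custID", all = TRUE), renamed)
merged
#   custID A_top_pct B_top_pct
# 1      1        10        10
# 2      2        20        20
# 3      6        NA        10
# 4      7        NA        20
```

Because every value column has a unique name before the merge, this also avoids suffix collisions when the list holds more than two data frames.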

Or, as @Colonel Beauvel suggested, a more readable approach is to wrap merge with Curry from the functional package, which fixes the by and all arguments in advance:

 library(functional)
 Reduce(Curry(merge, by='custID', all=TRUE), res)
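
If adding a package for this is not desirable, the same partial-application idea can be sketched in base R. The helper partial below is hypothetical (written here for illustration, not taken from any package), and res is a hand-built stand-in matching the list printed in the question.

```r
# Stand-in for res, shaped like the list printed in the question
res <- list(
  A = data.frame(custID = c(1L, 2L), top_pct = c(10, 20)),
  B = data.frame(custID = c(1L, 2L, 6L, 7L), top_pct = c(10, 20, 10, 20))
)

# Hypothetical helper: fix some arguments of f now, supply the rest later
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(list(...), fixed))
}

# merge with by and all already filled in; Reduce supplies x and y
merge_all <- partial(merge, by = "custID", all = TRUE)
out <- Reduce(merge_all, res)
out
```

The result has the same top_pct.x and top_pct.y columns as the plain Reduce call, since only the way the arguments are supplied has changed.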

edited May 01 '15 at 13:30

answered May 01 '15 at 13:25

akrun



Maybe even more readable with `functional` package: `Reduce(Curry(merge, by='custID', all=T), res)` – Colonel Beauvel May 01 '15 at 13:27
