Merging two lists of dataframes using R

Question

I would like to merge two lists of dataframes on a common id variable. Consider the following example:

set.seed(1)
mylist1 <- data.frame(id = sample(paste0("id", sample(1:5, 10, TRUE))),
                      var1 = sample(letters[1:26], 10, TRUE),
                      stringsAsFactors = FALSE)
mylist1 <- split(mylist1, mylist1$id)
set.seed(2)
mylist2 <- data.frame(id = sample(paste0("id", sample(1:5, 10, TRUE))),
                      var2 = sample(LETTERS[1:26], 10, TRUE),
                      stringsAsFactors = FALSE)
mylist2 <- split(mylist2, mylist2$id)

mylist1
# $id1
# id     var1
# id1    d
# 
# $id2
# id     var1
# id2    f
# id2    g
# id2    w
# etc.

mylist2
# $id1
# id     var2
# id1    V
# id1    D
# id1    J
# 
# $id3
# id     var2
# id3    K
# id3    J
# id3    Z
# etc.

The resulting list of dataframes should look like this:

# $id1
# id  var1 var2
# id1 d    V
# id1 d    D
# id1 d    J

# $id2
# id  var1 var2
# id2 f    NA
# id2 g    NA
# id2 w    NA
# etc.

Do you know how I could do this?

possible duplicate of [Simultaneously merge multiple data.frames in a list](http://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list) — jeremycg, Aug 13 '15 at 12:52
In this case you could merge the data.frames and then split the resulting one into a list — Andriy T., Aug 13 '15 at 12:53
Try `Map(merge, mylist1, mylist2,MoreArgs=list(by='id', all=TRUE))` — akrun, Aug 13 '15 at 12:55
In the example, the lengths of mylist1 and mylist2 are different, i.e. 5 vs. 4 — akrun, Aug 13 '15 at 13:02
@jeremycg. This is not the same question as the output should be a list of dataframes and not a data frame. akrun, I corrected the code, sorry for the typo — goclem, Aug 13 '15 at 13:03
@Clement In the example, some ids that are present in mylist2 are missing from mylist1. How do you want to deal with those cases? — akrun, Aug 13 '15 at 13:05
@akrun, this is the case for `id2`, which is not present in `mylist2`. In this case, the resulting dataframe (in the list) should take `NA` values for `id2`; see the example for an illustration — goclem, Aug 13 '15 at 13:08

akrun · Accepted Answer · 2015-08-13T13:25:18.597

We can use Map to do this. From the example dataset, it is clear that only some list elements are common to both (based on the names of the list elements).

The first step is to collect the unique names across both lists using union. We then subset the first list ('lst1') and the second list ('lst2') with those names ('nm1'). If a name is missing from a list, the element at that position is NULL.

nm1 <- union(names(mylist1), names(mylist2))
lst1 <- mylist1[nm1]
lst2 <- mylist2[nm1]
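
To illustrate what the subsetting step produces, here is a minimal sketch on toy lists. The names `a`, `b`, `nm`, `a_sub` and `b_sub` are hypothetical and not part of the original example. It shows that indexing a list with a name it does not contain yields a NULL element whose name is NA, so it may be worth restoring the names afterwards if the first list can also have gaps.

```r
# Hypothetical toy lists; only "id1" is shared between them.
a <- list(id1 = 1, id2 = 2)
b <- list(id1 = 10, id3 = 30)

nm <- union(names(a), names(b))  # all names found in either list
a_sub <- a[nm]
b_sub <- b[nm]

# Positions whose name is absent from the source list hold NULL,
# and the element name at that position is NA.
print(names(a_sub))
print(sapply(b_sub, is.null))

# Optional: relabel both lists so the positions carry the same names.
names(a_sub) <- nm
names(b_sub) <- nm
```

Map takes the names of its result from the first list argument, so the relabelling mainly matters when the first list is the one with missing elements.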

Next, we replace the NULL elements in each list with a placeholder 'data.frame' at that position. We can use if/else inside lapply to do this.

lst1 <- lapply(lst1, function(x) {
  if (is.null(x)) data.frame(id = NA, var1 = NA) else x
})
lst2 <- lapply(lst2, function(x) {
  if (is.null(x)) data.frame(id = NA, var2 = NA) else x
})
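
A placeholder with one row of NA values takes part in the merge as a real row, which is where the all-NA row 4 under $id2 in the output further down comes from. One possible variation, sketched here as an assumption rather than as part of the original answer, is a zero-row placeholder that has the expected columns but no rows. The objects `x`, `y`, `empty2`, `y_filled` and `res` are hypothetical, and the columns are assumed to be character.

```r
# Hypothetical toy inputs: one real data frame and one missing (NULL) element.
x <- data.frame(id = c("id2", "id2"), var1 = c("f", "g"),
                stringsAsFactors = FALSE)
y <- NULL

# Zero-row placeholder: same columns as a real element, but no rows.
empty2 <- data.frame(id = character(0), var2 = character(0),
                     stringsAsFactors = FALSE)
y_filled <- if (is.null(y)) empty2 else y

# Full outer merge: only the rows of x remain, with var2 filled with NA.
res <- merge(x, y_filled, by = "id", all = TRUE)
print(res)
```

With this placeholder the merged element contains only the rows of the data frame that actually exists, which is closer to the output requested in the question.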

After that, we can merge the two lists using Map, which merges the corresponding elements of the lists. Instead of using an anonymous function, we can use MoreArgs to pass the extra arguments to merge.

Map(merge, lst1, lst2, MoreArgs = list(by = 'id', all = TRUE))
#$id1
#   id var1 var2
#1 id1    d    V
#2 id1    d    D
#3 id1    d    J

#$id2
#    id var1 var2
#1  id2    f   NA
#2  id2    g   NA
#3  id2    w   NA
#4 <NA> <NA>   NA

#$id3
#   id var1 var2
#1 id3    y    K
#2 id3    y    J
#3 id3    y    Z

#$id4
#   id var1 var2
#1 id4    a    D
#2 id4    i    D

#$id5
#   id var1 var2
#1 id5    q    R
#2 id5    q    M
#3 id5    q    D
#4 id5    k    R
#5 id5    k    M
#6 id5    k    D
#7 id5    j    R
#8 id5    j    M
#9 id5    j    D
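
The comments on the question also mention another route: merge the complete data frames first and split the result afterwards. A minimal sketch of that idea follows, using hypothetical toy lists `l1` and `l2` in place of mylist1 and mylist2. It assumes every element has the same columns, so the elements can be stacked with rbind.

```r
# Hypothetical toy lists standing in for mylist1 / mylist2.
l1 <- list(
  id1 = data.frame(id = "id1", var1 = "d", stringsAsFactors = FALSE),
  id2 = data.frame(id = c("id2", "id2"), var1 = c("f", "g"),
                   stringsAsFactors = FALSE)
)
l2 <- list(
  id1 = data.frame(id = c("id1", "id1"), var2 = c("V", "D"),
                   stringsAsFactors = FALSE)
)

# Stack each list into a single data frame.
df1 <- do.call(rbind, l1)
df2 <- do.call(rbind, l2)

# Full outer merge on id, then split back into a list keyed by id.
merged <- merge(df1, df2, by = "id", all = TRUE)
out <- split(merged, merged$id)
print(out)
```

Because the placeholder step is skipped entirely, ids missing from one list get NA in that list's column without an extra all-NA row. Row names and row order within each element may differ from the Map-based result.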
