1

I am trying to merge two dataframes, lets say A and B, of unequal number of rows and columns. They have two common columns, c1 and c2. I know for a fact that every c1 x c2 combination that exists in one dataframe also exists in the other, so I'm not expecting any null values. A has many instances of each c1 x c2 combination that exists, whereas B has exactly one instance of each c1 x c2 combination. rbind and merge haven't been working.

If the two data frames are:

  c1 c2 c3 c4
1  A  1  5  1
2  B  2  4  2
3  B  1  3  4
4  A  2  3  4
5  A  1  3  3
6  B  2  1  8

and

  c1 c2 c5
1  A  1  5
2  B  2  4
3  B  1  3
4  A  2  8

then I want to somehow bind with c1 and c2 and produce:

  c1 c2 c3 c4 c5
1  A  1  5  1  5
2  B  2  4  2  4
3  B  1  3  4  3
4  A  2  3  4  8
5  A  1  3  3  5
6  B  2  1  8  4
slothish1
  • 119
  • 1
  • 11
  • 1
    `merge(A, B, by=c("c1","c2"))` should work – SymbolixAU Mar 23 '16 at 03:40
  • Or `dplyr::full_join(A, B)`, which in this case is not particularly better. In both cases, you can specify the columns to join `by` if you like, but it'll join by the intersection of the column names, anyway, which is what you want here. – alistaire Mar 23 '16 at 03:43

2 Answers2

3
A <- data.frame(c1=c('A','B','B','A','A','B'),c2=c(1,2,1,2,1,2),c3=c(5,4,3,3,3,1),c4=c(1,2,4,4,3,8),stringsAsFactors=F);
B <- data.frame(c1=c('A','B','B','A'),c2=c(1,2,1,2),c5=c(5,4,3,8),stringsAsFactors=F);
merge(A,B,c('c1','c2'));
##   c1 c2 c3 c4 c5
## 1  A  1  5  1  5
## 2  A  1  3  3  5
## 3  A  2  3  4  8
## 4  B  1  3  4  3
## 5  B  2  4  2  4
## 6  B  2  1  8  4
bgoldst
  • 34,190
  • 6
  • 38
  • 64
0

We could also use join from data.table with on

library(data.table)
setDT(df1)[df2, on= c("c1", "c2")][order(c1)]
#   c1 c2 c3 c4 c5
#1:  A  1  5  1  5
#2:  A  1  3  3  5
#3:  A  2  3  4  8
#4:  B  2  4  2  4
#5:  B  2  1  8  4
#6:  B  1  3  4  3
akrun
  • 874,273
  • 37
  • 540
  • 662