My problem is similar to this question: Conditional merge/replacement in R
But I would like to merge by multiple colums, in my example by x1 and x2 (in real life with more than 2 columns).
Of note, those conditions result in unique combinations (unique rows)!
df1 <- data.frame(x1 = c(1,1,2,2), x2 = rep(letters[1:2],2), x3 = letters[1:4], stringsAsFactors = FALSE )
df2 <- data.frame(x1 = c(1,2,3), x2 = letters[1:3], x3 = c('xx','zz','yy'), stringsAsFactors = FALSE )
df1
x1 x2 x3
1 1 a a
2 1 b b
3 2 a c
4 2 b d
df2
x1 x2 x3
1 1 a xx
2 2 b zz
3 3 c yy
Desired output
x1 x2 x3 # 'merge/replaced' df1$x3 with df2$x3, by x1 & x2
1 1 a xx
2 1 b b
3 2 a c
4 2 b zz
The following works in general and gives the desired output for the example:
require(dplyr)
anti1 <- anti_join(df1, df2, by = c('x1','x2'))
semi2 <- semi_join(df2, df1, by = c('x1','x2'))
full_join(semi2,anti1)
x1 x2 x3
1 1 a xx
2 2 b zz
3 1 b b
4 2 a c # the row order is not important for me
But this doesn't seem really smart to me and also I do not know how this will behave when I have data frames with columns that are not part of the conditions.
I am sorry if this one should be a dup. If so, please hint me towards it. If you should have a better suggestion than my join approach: Thanks! I am open for many packages, but I am not really a data table guy...