For example, df1 looks like below -
X1 X2 X3 X4 X5
Apple Belgium Red Purchase 100
Guava Germany Green Sale 200
Grape Italy Purple Purchase 500
Orange India Orange Sale 2000
df2 looks like below -
X1 X2 X3 X4 X5
Apple Belgium Red Purchase 10000
Guava Germany Green Sale 20000
Grape Italy Purple Purchase
Orange India Orange Sale 2000
My output should look like -
X1 X2 X3 X4 X5.x X5.y
Apple Belgium Red Purchase 100 10000
Guava Germany Green Sale 200 20000
Grape Italy Purple Purchase 500 NA
Here multiple operations are involved -
Pick the rows present in 1 and not in other, vice versa
Pick the mismatches in X5 column (X5 is my target column) when the first 4 column matches
I do not want the matches.
I tried a combination of inner_join, full_join and anti_join of both to obtain the part1. How do I perform the second part? Is there a conditional join available in R that picks only the mismatches and ignores when the target column is same?
I don't want to use sqldf. I know this can be achieved in SQL. I want to do this in dplyr. Any help is much appreciated.
TIA.