I have 2 data.frames of unequal dimensions:
df1 <- data.frame(x = c(1, 2, 3),
                  y = c("foo", "bar", "wow"))
df2 <- data.frame(x = c(2, 2, 5, 7),
                  y = c("foo", "bar", "wow", "new"),
                  z = c("this", "is", "weird", "huh"))
In df1, each x value is paired with a specific y value.
I need to compare df2 to df1 to verify that the variables x and y are correct: each (x, y) pair in df2 must match a pair in df1.
How can I get the rows of df2 whose values either do not appear in df1 or do not match the pairing between x and y?
So, from these two dataframes:
# df1
#   x   y
#   1 foo
#   2 bar
#   3 wow
# df2
#   x   y     z
#   2 foo  this
#   2 bar    is
#   5 wow weird
#   7 new   huh
I would like to get a result like:
# badRows
#   x   y     z
#   2 foo  this
#   5 wow weird
#   7 new   huh
I've tried using identical() and compare() with no luck.
Update: A provisional answer was found here (although how to use multiple keys was not clear to me until after this question was posted): Find complement of a data frame (anti-join)
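For anyone landing here, this is a minimal base R sketch of an anti-join on multiple keys, using the example data above. It is one possible approach, not necessarily the one in the linked answer. The names key1 and key2 are made up for illustration, and the "|" separator assumes that no x or y value contains that character.

```r
df1 <- data.frame(x = c(1, 2, 3),
                  y = c("foo", "bar", "wow"))
df2 <- data.frame(x = c(2, 2, 5, 7),
                  y = c("foo", "bar", "wow", "new"),
                  z = c("this", "is", "weird", "huh"))

# Build one composite key per row from both key columns.
# Assumption: "|" never appears in the x or y values.
key1 <- paste(df1$x, df1$y, sep = "|")
key2 <- paste(df2$x, df2$y, sep = "|")

# Keep the rows of df2 whose (x, y) pair has no match in df1.
badRows <- df2[!(key2 %in% key1), ]
badRows
```

If dplyr is available, anti_join(df2, df1, by = c("x", "y")) should express the same idea without building the keys by hand.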