There are 2 datasets
s=structure(list(var1 = c("a", "f", "k", "tt", "ee"), var2 = c("b",
"g", "l", "qq", "rr"), var3 = c("c", "h", "m", "ff", "cc"), var4 = c("d",
"i", "n", "gg", "vv"), var5 = c("e", "j", "o", "aa", "xx"), metric_var = c(100L,
200L, 300L, 567L, 789L)), class = "data.frame", row.names = c(NA,
-5L))
and
medagger=structure(list(var1 = c("a", "z", "w", "f", "k"), var2 = c("b",
"u", "e", "g", "l"), var3 = c("c", "p", "r", "h", "m"), var4 = c("d",
"q", "q", "i", "n"), var5 = c("e", "n", "w", "j", "o"), metric_var = c(100L,
200L, 400L, 500L, 700L)), class = "data.frame", row.names = c(NA,
-5L))
var1-var5 are categorical variables, their values are categories.
if we do an inner join s
and a medagger
we get this result
merge(s,medagger,by=c("var1","var2","var3","var4","var5"))
var1 var2 var3 var4 var5 metric_var.x metric_var.y
1 a b c d e 100 100
2 f g h i j 200 500
3 k l m n o 300 700
only 3 rows to return, because in these datasets there are the same categories.
However, I need the categories that there are in the s
dataset but not in the medagger
dataset
were put in separate dataframes.
In this case, I need the dataframe new
to appear, which will contain only 2 of these rows from s
.
var1 var2 var3 var4 var5 metric_var
tt qq ff gg aa 567
ee rr cc vv xx 789
How to do that the categories that were in s
but not in the medagger
appear in the new
dataframe ?
Thank you.