Say I have the data frame DF that I want to subset based on the data frame DF_select:
library(dplyr)
set.seed(1)
DF <- data.frame(A = rep(LETTERS[1:4], each = 3),
                 B = rep(1:3, 4),
                 C = c("some", "thing"))
DF_select <- sample_n(DF, 5)
Using row indices won't work in the real example, because the data frame DF has multiple rows for each matching row in DF_select.
Using filter with A %in% DF_select$A & B %in% ... won't work either, as it will also match combinations that are not in the DF_select data frame.
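To make the over-matching concrete, here is a small sketch. It uses a hand-picked two-row selection (the name sel is made up for this illustration and is not the sampled DF_select), so the result does not depend on the random seed.

```r
library(dplyr)

DF <- data.frame(A = rep(LETTERS[1:4], each = 3),
                 B = rep(1:3, 4),
                 C = c("some", "thing"))

# Hypothetical selection for illustration: rows (A, 1, some) and (B, 3, thing)
sel <- DF[c(1, 6), ]

# Each column is tested independently, so cross-combinations slip through
over_matched <- filter(DF,
                       A %in% sel$A &
                       B %in% sel$B &
                       C %in% sel$C)

# 4 rows come back: (A, 3, some) and (B, 1, thing) match too,
# although neither combination is in sel
nrow(over_matched)
```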
I can solve it by creating a temporary variable as a unique row identifier in both data frames, like this:
mutate(DF, temp_var = paste(A, B, C, sep = "_"))
but I was wondering if there is a more elegant solution?
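For completeness, this is roughly what the temporary-variable workaround looks like when written out end to end. It is only a sketch of my current approach: the names DF_key, DF_select_key and DF_subset are made up here, and it assumes the pasted values never contain the separator "_" themselves.

```r
library(dplyr)

set.seed(1)
DF <- data.frame(A = rep(LETTERS[1:4], each = 3),
                 B = rep(1:3, 4),
                 C = c("some", "thing"))
DF_select <- sample_n(DF, 5)

# Build the same pasted key in both data frames
DF_key        <- mutate(DF,        temp_var = paste(A, B, C, sep = "_"))
DF_select_key <- mutate(DF_select, temp_var = paste(A, B, C, sep = "_"))

# Keep rows whose key occurs in the selection, then drop the helper column
DF_subset <- DF_key %>%
  filter(temp_var %in% DF_select_key$temp_var) %>%
  select(-temp_var)
```

This works, but it needs an extra column in both data frames, and the key can become ambiguous if any value already contains the separator.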