I am looking to filter observations out of my data, by group, based on values listed in a separate table. I am also trying to work exclusively with dplyr, whereas I have previously done tasks like this with data.table, and I'm not sure how to accomplish it at all.
Here is some sample data to illustrate:
# Primary dataset
dat <- data.frame(account = c(1, 3, 3, 3, 5, 5, 7),
                  ip = c("255.255.255", "255.255.255", "199.199.99",
                         "255.255.255", "75.75.75", "120.120.120",
                         "50.50.50"),
                  value = c(50, 1000, 800, 2500, 3000, 500, 75))
From the dataset, I would like to filter based on a list of IPs per account, which is another table:
# Filtering reference table
exclude <- data.frame(account = c(3, 5),
                      ip = c("255.255.255", "120.120.120"))
The desired output of dat after filtering would be:
  account          ip value
1       1 255.255.255    50
2       3  199.199.99   800
3       5    75.75.75  3000
4       7    50.50.50    75
I am specifically unsure how to include the reference table in a group_by within a piped (%>%) series of dplyr verbs on dat. I may also be approaching the task incorrectly, given that I am still familiarizing myself with the dplyr style of programming, so I am open to a different approach than the one I am considering, as long as it stays within dplyr.
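For reference, one candidate that might fit, offered as a sketch rather than a confirmed solution: dplyr's anti_join is a filtering join that keeps the rows of the first table with no match in the second, so no group_by would be needed. The example below reuses the sample data from above. The name result and the stringsAsFactors = FALSE argument are my own additions (the latter only to keep ip as character on older R versions).

```r
library(dplyr)

# Sample data copied from the question; stringsAsFactors = FALSE is an
# added assumption so that ip stays a character column on R < 4.0
dat <- data.frame(account = c(1, 3, 3, 3, 5, 5, 7),
                  ip = c("255.255.255", "255.255.255", "199.199.99",
                         "255.255.255", "75.75.75", "120.120.120",
                         "50.50.50"),
                  value = c(50, 1000, 800, 2500, 3000, 500, 75),
                  stringsAsFactors = FALSE)

exclude <- data.frame(account = c(3, 5),
                      ip = c("255.255.255", "120.120.120"),
                      stringsAsFactors = FALSE)

# Keep only the rows of dat whose (account, ip) pair does not appear in exclude.
# `result` is a hypothetical name used for illustration.
result <- dat %>%
  anti_join(exclude, by = c("account", "ip"))

result
```

Matching on both account and ip means that an IP is only dropped for the accounts listed alongside it in exclude, which is why account 1 would keep 255.255.255 while account 3 would lose it.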