I’m trying to remove rows in one dataframe (df1) in which values from three columns match values from another dataframe consisting of those same three columns (df2). So for example:

df1 = data.frame(id=c(1552, 1552, 2501, 2504, 2504, 2504), month=c(4, 6, 7, 3, 4, 4), year=c(1970, 1970, 1971, 1971, 1971, 1972), weight=c(135, 654, 164, 83, 155, 195), sex=c('F', 'F', 'M', 'F', 'F', 'F'))

df2 = data.frame(id=c(1552, 2504), month=c(6, 4), year=c(1970, 1971))

In the end I would like this:

    id month year weight sex
1 1552     4 1970    135   F
2 2501     7 1971    164   M
3 2504     3 1971     83   F
4 2504     4 1972    195   F

This question seems similar: "Subset a data frame based on another", but I’m unable to apply the suggested solution to my problem. Does anyone know how to do this?

Erica

1 Answer

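I think dplyr::anti_join will be helpful here. It returns the rows of df1 that have no match in df2, joining on the columns the two data frames share (id, month and year):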

library(dplyr)
df1 <- data.frame(id = c(1552, 1552, 2501, 2504, 2504, 2504),
                  month = c(4, 6, 7, 3, 4, 4),
                  year = c(1970, 1970, 1971, 1971, 1971, 1972),
                  weight = c(135, 654, 164, 83, 155, 195),
                  sex = c('F', 'F', 'M', 'F', 'F', 'F'))
df2 <- data.frame(id = c(1552, 2504), month = c(6, 4), year = c(1970, 1971))
df1 %>% anti_join(df2)
## Joining by: c("id", "month", "year")
##     id month year weight sex
## 1 2504     4 1972    195   F
## 2 2504     3 1971     83   F
## 3 2501     7 1971    164   M
## 4 1552     4 1970    135   F
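
If you would rather not depend on dplyr, the same idea can be sketched in base R: build a composite key from the three columns in each data frame and keep the rows of df1 whose key does not appear in df2. The names key_cols, key1, key2 and result are just illustrative, and the "\r" separator is an assumption that this character never occurs in the data.

```r
df1 <- data.frame(id = c(1552, 1552, 2501, 2504, 2504, 2504),
                  month = c(4, 6, 7, 3, 4, 4),
                  year = c(1970, 1970, 1971, 1971, 1971, 1972),
                  weight = c(135, 654, 164, 83, 155, 195),
                  sex = c('F', 'F', 'M', 'F', 'F', 'F'))
df2 <- data.frame(id = c(1552, 2504), month = c(6, 4), year = c(1970, 1971))

# Columns that must all match for a row to be dropped
key_cols <- c("id", "month", "year")

# Paste the key columns together row by row to get one key per row
# (assumes "\r" does not appear in any of the values)
key1 <- do.call(paste, c(df1[key_cols], sep = "\r"))
key2 <- do.call(paste, c(df2[key_cols], sep = "\r"))

# Keep only the rows of df1 whose key is absent from df2
result <- df1[!key1 %in% key2, ]
rownames(result) <- NULL  # renumber the remaining rows from 1
result
```

This keeps the rows in their original df1 order, which is the order shown in the question.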
dickoa