Select rows based in rows of another data.frame

Question

I have these following data.frames:

dt1

Id  Mother Weight  
1    elly     10
2    bina     20
3    sirce    30
4    tina     30
5    lina     40

and

dt2

Id   Mother  Weight  sex  
1    elly     10      M
2    bina     20      F
3    sirce    30      F

And I would like select rows from DT1 (ID) based in DT2 (ID), this way:

new.dt

Id   Mother  Weight  sex
4    tina     30     NA
5    lina     40     NA

Possible duplicate of [Compare two data.frames to find the rows in data.frame 1 that are not present in data.frame 2](https://stackoverflow.com/questions/3171426/compare-two-data-frames-to-find-the-rows-in-data-frame-1-that-are-not-present-in) — andrew_reece, May 03 '19 at 21:57
@akrun one answer is `anti_join(a1,a2)` - it's not _exactly_ the same, but it's pretty close and OP hasn't indicated any effort to search through the many similar answers on SO before posting. there's plenty on SO on this topic. — andrew_reece, May 03 '19 at 22:02
@akrun, agreed! it does seem to be somewhat of a code-specific cultural point on SO. for example, i see way more dupe-eligible questions tagged as dupe in the python threads than in the r threads. not sure why that is. anyway, i do think this question is a close duplicate of several questions on the site. — andrew_reece, May 03 '19 at 22:06
@andrew_reece Sorry, but the previous question was very very confuse. — Curious G., May 03 '19 at 22:09
Possible duplicate of [Find complement of a data frame (anti - join)](https://stackoverflow.com/questions/28702960/find-complement-of-a-data-frame-anti-join) — camille, May 03 '19 at 22:53
@andrew_reece: There doesn't appear to be any serious interest in keeping the SO collection free of dupes. There's no serious penalty for failing to search and there is a penalty for downvoting and no rep for finding dupes. — IRTFM, May 04 '19 at 04:39

score 1 · Accepted Answer · answered May 03 '19 at 21:53

Here is one option with anti_join

library(dplyr)
anti_join(dt1 %>% 
           mutate(sex = NA), dt2, by = 'Id')
#   Id Mother Weight sex
#1  4   tina     30  NA
#2  5   lina     40  NA

data

dt1 <- structure(list(Id = 1:5, Mother = c("elly", "bina", "sirce", 
"tina", "lina"), Weight = c(10L, 20L, 30L, 30L, 40L)), 
   class = "data.frame", row.names = c(NA, 
-5L))


dt2 <- structure(list(Id = 1:3, Mother = c("elly", "bina", "sirce"), 
    Weight = c(10L, 20L, 30L), sex = c("M", "F", "F")), 
  class = "data.frame", row.names = c(NA, 
-3L))

user11450027 · Answer 2 · 2019-05-03T22:21:53.880

1

transform(dt1[!dt1$Id %in% dt2$Id,], sex = NA)
#  Id Mother Weight sex
#4  4   tina     30  NA
#5  5   lina     40  NA

d = merge(dt1, dt2, all = TRUE)
d[is.na(d$sex),]
#  Id Mother Weight  sex
#4  4   tina     30 <NA>
#5  5   lina     40 <NA>

edited May 03 '19 at 22:21

answered May 03 '19 at 22:10

user11450027

66
2

1

Could you add some explanation of what your code does? – camille May 03 '19 at 23:36

Select rows based in rows of another data.frame

2 Answers2

data