1

I have these following data.frames:

dt1

Id  Mother Weight  
1    elly     10
2    bina     20
3    sirce    30
4    tina     30
5    lina     40

and

dt2

Id   Mother  Weight  sex  
1    elly     10      M
2    bina     20      F
3    sirce    30      F

And I would like select rows from DT1 (ID) based in DT2 (ID), this way:

new.dt

Id   Mother  Weight  sex
4    tina     30     NA
5    lina     40     NA
Curious G.
  • 838
  • 8
  • 23
  • Possible duplicate of [Compare two data.frames to find the rows in data.frame 1 that are not present in data.frame 2](https://stackoverflow.com/questions/3171426/compare-two-data-frames-to-find-the-rows-in-data-frame-1-that-are-not-present-in) – andrew_reece May 03 '19 at 21:57
  • @akrun one answer is `anti_join(a1,a2)` - it's not _exactly_ the same, but it's pretty close and OP hasn't indicated any effort to search through the many similar answers on SO before posting. there's plenty on SO on this topic. – andrew_reece May 03 '19 at 22:02
  • 1
    @akrun, agreed! it does seem to be somewhat of a code-specific cultural point on SO. for example, i see way more dupe-eligible questions tagged as dupe in the python threads than in the r threads. not sure why that is. anyway, i do think this question is a close duplicate of several questions on the site. – andrew_reece May 03 '19 at 22:06
  • @andrew_reece Sorry, but the previous question was very very confuse. – Curious G. May 03 '19 at 22:09
  • 1
    Possible duplicate of [Find complement of a data frame (anti - join)](https://stackoverflow.com/questions/28702960/find-complement-of-a-data-frame-anti-join) – camille May 03 '19 at 22:53
  • @andrew_reece: There doesn't appear to be any serious interest in keeping the SO collection free of dupes. There's no serious penalty for failing to search and there is a penalty for downvoting and no rep for finding dupes. – IRTFM May 04 '19 at 04:39

2 Answers2

1

Here is one option with anti_join

library(dplyr)
anti_join(dt1 %>% 
           mutate(sex = NA), dt2, by = 'Id')
#   Id Mother Weight sex
#1  4   tina     30  NA
#2  5   lina     40  NA

data

dt1 <- structure(list(Id = 1:5, Mother = c("elly", "bina", "sirce", 
"tina", "lina"), Weight = c(10L, 20L, 30L, 30L, 40L)), 
   class = "data.frame", row.names = c(NA, 
-5L))


dt2 <- structure(list(Id = 1:3, Mother = c("elly", "bina", "sirce"), 
    Weight = c(10L, 20L, 30L), sex = c("M", "F", "F")), 
  class = "data.frame", row.names = c(NA, 
-3L))
akrun
  • 874,273
  • 37
  • 540
  • 662
1
transform(dt1[!dt1$Id %in% dt2$Id,], sex = NA)
#  Id Mother Weight sex
#4  4   tina     30  NA
#5  5   lina     40  NA

d = merge(dt1, dt2, all = TRUE)
d[is.na(d$sex),]
#  Id Mother Weight  sex
#4  4   tina     30 <NA>
#5  5   lina     40 <NA>