Extract rows from the second data frame that are newly added compared to the first data frame

Question

I have two data frames. The second data frame can contain some rows from the first data frame as well as other rows. I need to find the rows that were newly added, that is, the rows that appear only in the second data frame and not in the first.

Below is an example with the expected output:

comp1 <- data.frame(sector = c('Sector_123', 'Sector_456', 'Sector_789', 'Sector_101', 'Sector_111', 'Sector_113', 'Sector_115', 'Sector_117'), id = c(1, 2, 3, 4, 5, 6, 7, 8), stringsAsFactors = FALSE)

comp2 <- data.frame(sector = c('Sector_456', 'Sector_789', 'Sector_000', 'Sector_222'), id = c(2, 3, 6, 5), stringsAsFactors = FALSE)

The expected output should look like this:

sector             id
Sector_000          6
Sector_222          5

I should not use additional libraries such as compare or data.table. Any suggestions?

Does this answer your question? [Find complement of a data frame (anti - join)](https://stackoverflow.com/questions/28702960/find-complement-of-a-data-frame-anti-join) — Paul, Jun 22 '20 at 12:24

Martin Gal · Answer 1 · 2020-06-22T12:24:54.623


This assumes that rows are matched on the sector column only. To match on all columns, just remove that restriction.

We could use dplyr:

library(dplyr)
anti_join(comp2, comp1, by = "sector")

gives us

> anti_join(comp2, comp1, by="sector")
      sector id
1 Sector_000  6
2 Sector_222  5

With base R, we could use

comp2[!comp2$sector %in% comp1$sector,]
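To match on all columns in base R, one possible approach (a sketch, not part of the original answer) is to paste each row's columns into a single key and compare the keys. It assumes both data frames have the same columns in the same order; key1, key2, and new_rows are illustrative names only.

```r
comp1 <- data.frame(sector = c('Sector_123', 'Sector_456', 'Sector_789', 'Sector_101',
                               'Sector_111', 'Sector_113', 'Sector_115', 'Sector_117'),
                    id = c(1, 2, 3, 4, 5, 6, 7, 8), stringsAsFactors = FALSE)
comp2 <- data.frame(sector = c('Sector_456', 'Sector_789', 'Sector_000', 'Sector_222'),
                    id = c(2, 3, 6, 5), stringsAsFactors = FALSE)

# Build one string key per row from all columns.
# Assumption: the separator "\r" never occurs in the data itself.
key1 <- do.call(paste, c(comp1, sep = "\r"))
key2 <- do.call(paste, c(comp2, sep = "\r"))

# Keep the rows of comp2 whose key does not occur in comp1
new_rows <- comp2[!key2 %in% key1, ]
print(new_rows)
```

With the sample data this keeps the same two rows as the sector-only version. The results differ only when a sector exists in both data frames but with a different id.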

edited Jun 22 '20 at 12:24

answered Jun 22 '20 at 12:19

Martin Gal


I'm getting the result below, which is not as expected; the id is wrong: sector id Sector_000 3 Sector_222 4 Sector_000 7 Sector_222 8 – Hmm Jun 22 '20 at 13:09
What did you use? – Martin Gal Jun 22 '20 at 13:13
Both statements give the same results for me. I tried both. – Hmm Jun 22 '20 at 13:15
Are you sure you used the data shown in your question? I get the result shown above. – Martin Gal Jun 22 '20 at 13:22
