Fund ID comparisons across dataframes in R

Question

I am compiling a report where I compare fund IDs to find any discrepancies between (company name, ID) pairings across dataframes. Tables below:

Company	ID
A	123
B	456
C	789

and

Company	ID
B	453
A	123
C	789

In this case, ideally there is some function that reads each company name, identifies each corresponding ID, and then finds the same company name with corresponding ID from the next table. If they are the same value, I assume it would return TRUE and if different, return FALSE.

In the example above, it would return TRUE, FALSE, TRUE since the only discrepancy is with company B.

I'm thinking it must be some nested variation of sapply but for the life of me I can't figure out what the syntax would be (since the companies will be on different rows).

A near duplicate of [Is there an R function for finding the index of an element in a vector?](https://stackoverflow.com/questions/5577727/is-there-an-r-function-for-finding-the-index-of-an-element-in-a-vector). `paste(df1$company, df1$ID) %in% paste(df2$company, df2$ID)` should do the work. — ekoam, Jan 09 '22 at 02:53
Your wording is vague. Are you guaranteed to only ever have 2 tables, or do you need to accomodate more? — camille, Jan 09 '22 at 20:47

score 3 · Answer 1 · answered Jan 09 '22 at 05:12

I think a join with a comparison works well here.

library(dplyr)

df_x<-tibble::tribble(
       ~Company,  ~ID,
            "A", 123L,
            "B", 456L,
            "C", 789L
       )

df_y<-tibble::tribble(
       ~Company,  ~ID,
            "B", 453L,
            "A", 123L,
            "C", 789L
       )

df_joined<-df_x %>%
  left_join(df_y, by="Company") %>%
  mutate(discrepancy=(ID.x!=ID.y))

df_joined

#> # A tibble: 3 × 4
#>   Company  ID.x  ID.y discrepancy
#>   <chr>   <int> <int> <lgl>       
#> 1 A         123   123 FALSE       
#> 2 B         456   453 TRUE        
#> 3 C         789   789 FALSE

^{Created on 2022-01-09 by the reprex package (v2.0.1)}

This is good at this scale, but won't work if there are more than 2 tables — camille, Jan 09 '22 at 20:46

Fund ID comparisons across dataframes in R

1 Answers1