0

I have 2 data frames with account numbers and amounts plus some other irrelevant columns. I would like to compare the output with a Y or N if they match or not.

I need to compare the account number in row 1 in dataframe A to the account number in row 1 in dataframe B and if they match put a Y in a column or an N if they don't. I've managed to get the code to check if there is a match in the entire dataframe but I need to check each row individually.

E.g. df1
|account.num|x1|x2|x3|
|100|a|b|c|
|101|a|b|c|
|102|a|b|c|
|103|a|b|c|

df2
|account.num|x1|x2|x3|
|100|a|b|c|
|102|a|b|c|
|101|a|b|c|
|103|a|b|c|

output
|account.num|x1|x2|x3|match|
|100|a|b|c|Y|
|101|a|b|c|N|
|102|a|b|c|N|
|103|a|b|c|Y|

So, row 1 matches as they have the same account number, but row 2 doesn't because they are different. However, the other data in the dataframe doesn't matter just that column. Can I do this without merging the data frames? (I did have tables, but they won't work. I don't know why. So sorry if that's hard to follow).

AndrewGB
  • 16,126
  • 5
  • 18
  • 49
penguiner
  • 3
  • 2
  • 1
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Sounds like you need to [merge your data](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right?rq=1) – MrFlick Jul 06 '21 at 06:15
  • `comparedf` is the function you are looking for. https://cran.r-project.org/web/packages/arsenal/vignettes/comparedf.html – Koundy Jul 06 '21 at 06:22

3 Answers3

0

If you want a base R solution, here is a quick sketch. Assuming boath dataframes are of the same length (number of rows), it should work with your data.

# example dataframes
a <- data.frame(A=c(1,2,3), B=c("one","two","three"))
b <- data.frame(A=c(3,2,1), B=c("three","two","one"))


res <- c() #initialise empty result vector

for (rownum in c(1:nrow(a))) {
  # iterate over all numbers of rows
  res[rownum]  <- all(a[rownum,]==b[rownum,])
}

res # result vector
# [1] FALSE  TRUE FALSE

# you can put it in frame a like this. example colname is "equalB"
a$equalB <- res
Sandwichnick
  • 1,379
  • 6
  • 13
0

You can use == to compare if account.num is equal, and use this boolean vector to subset c("N", "Y")

df1$match <- c("N", "Y")[1 + (df1[[1]] == df2[[1]])]

df1
#  account.num x1 x2 x3 match
#1         100  a  b  c     Y
#2         101  a  b  c     N
#3         102  a  b  c     N
#4         103  a  b  c     Y

Data:

df1 <- data.frame(account.num=100:103, x1="a", x2="b", x3="c")
df2 <- data.frame(account.num=c(100,102,101,103), x1="a", x2="b", x3="c")
GKi
  • 37,245
  • 2
  • 26
  • 48
0

If you want a tidyverse solution, you can use left_join.

The principle here would be to try to match the data from df2 to the data from df1. If it matches, it would add TRUE to a match column. Then, the code replace the NA values with FALSE.

I'm also adding code to create the data frames from the exemple.

library(tidyverse)

df1 <- 
  tribble(~account_num, ~x1, ~x2, ~x3,
                   100, "a", "b", "c",
                   101, "a", "b", "c",
                   102, "a", "b", "c",
                   103, "a", "b", "c") %>%
  rowid_to_column() # because position in the df is an important information, 
                    # I need to hardcode it in the df

df2 <- 
  tribble(~account_num, ~x1, ~x2, ~x3,
                   100, "a", "b", "c",
                   102, "a", "b", "c",
                   101, "a", "b", "c",
                   103, "a", "b", "c") %>%
  rowid_to_column()

# take a
df1 %>% 
  # try to match df1 with version of df2 with a new column where `match` = TRUE
  # according to `rowid`, `account_num`, `x1`, `x2`, and `x3`
  left_join(df2 %>% 
              tibble::add_column(match = TRUE),
            by = c("rowid", "account_num", "x1", "x2", "x3")
  ) %>% 
  # replace the NA in `match` with FALSE in the df
  replace_na(list(match = FALSE))
Cedric
  • 71
  • 1
  • 5