I have two data frames to compare. Screenshots of the data frames are shown below.

[Screenshot: Data 1]

There are three things I am trying to check:

  1. 1st check: items that exist in Data 1 but do not exist in Data 2 [Item4; SubItem4; SubsubItem1]
  2. 2nd check: items that do not exist in Data 1 but do exist in Data 2 [Item6; SubItem1; SubsubItem1]
  3. 3rd check: items that exist in both data frames but whose value has changed [Item2; SubItem5; SubsubItem1]

I got the first and second checks easily with anti_join():

MissingfromData2 <- anti_join(Data1,Data2, by = c("Property.1","Property.2","Property.3"))
MissingfromData1 <- anti_join(Data2,Data1, by = c("Property.1","Property.2","Property.3"))
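Because the real data frames are only available as screenshots, here is a minimal sketch of these two calls on made-up data. The `Value` column and every row below are assumptions for illustration; only the three `Property` key columns come from the code above.

```r
library(dplyr)

# Hypothetical stand-ins for the screenshots; `Value` is an assumed column name.
Data1 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item4"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem4"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 20, 40)
)
Data2 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item6"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem1"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 25, 60)
)

keys <- c("Property.1", "Property.2", "Property.3")

# Check 1: rows of Data1 whose key combination has no match in Data2
MissingfromData2 <- anti_join(Data1, Data2, by = keys)

# Check 2: rows of Data2 whose key combination has no match in Data1
MissingfromData1 <- anti_join(Data2, Data1, by = keys)
```

On this toy data each result holds a single row, because anti_join() matches on the key columns only and ignores `Value`.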

For the 3rd check, however, I cannot seem to restrict the comparison to the identifier columns given in by = c("Property.1","Property.2","Property.3").

When I do the following:

changedValue1 <- setdiff(Data1,Data2, by = c("Property.1","Property.2","Property.3")) 

[Screenshot: Setdiff1]

changedValue2 <- setdiff(Data2,Data1, by = c("Property.1","Property.2","Property.3"))

[Screenshot: Setdiff2]

In both results I also get the additional rows from checks 1 and 2, which I do not need.

How do I obtain the result for only changed values?

Mr.CR
  • Please post a minimal reproducible example (https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Kay Sep 15 '20 at 12:09
  • You should post a minimal reproducible example, and not screenshots of the data. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Pedro Faria Sep 15 '20 at 12:13

1 Answer

I found the solution to the problem. All I needed to add to the code above was the following line:

Result <- setdiff(changedValue2,MissingfromData1, by = c("Property.1","Property.2","Property.3"))

This returned the one row of changedValue2 that does not appear in MissingfromData1, that is, the row whose identifiers exist in both data frames but whose value columns differ.
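For anyone who wants to try this without the original data, here is a rough sketch on made-up data. The `Value` column and all rows are assumptions, not the real data frames. As far as I can tell, dplyr's setdiff() compares whole rows rather than key columns, so the sketch calls it without `by`. It also shows a possible alternative that matches on the keys with inner_join() and keeps only the rows where the value differs.

```r
library(dplyr)

# Hypothetical data; `Value` is an assumed column name.
Data1 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item4"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem4"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 20, 40)
)
Data2 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item6"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem1"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 25, 60)
)
keys <- c("Property.1", "Property.2", "Property.3")

# Approach from this answer: whole-row difference, then drop the rows
# that are simply new in Data2.
MissingfromData1 <- anti_join(Data2, Data1, by = keys)
changedValue2    <- setdiff(Data2, Data1)             # changed rows plus new rows
Result           <- setdiff(changedValue2, MissingfromData1)

# Alternative sketch: match on the keys, then keep rows whose value differs.
# Rows where either value is NA are dropped by filter().
changed <- inner_join(Data1, Data2, by = keys, suffix = c(".old", ".new")) %>%
  filter(Value.old != Value.new)
```

The join variant has the side benefit of showing the old and new values next to each other in one row.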

Mr.CR