I have two data frames to compare. Screenshots of the data frames are shown below.
There are three things I am trying to check:
- 1st Check: Items that exist in Data 1, but do not exist in Data 2 [Item4; SubItem4; SubsubItem1]
- 2nd Check: Items that do not exist in Data 1, but do exist in Data 2 [Item6; SubItem1; SubsubItem1]
- 3rd Check: Items that exist in both data frames, but whose value has changed [Item2; SubItem5; SubsubItem1]
I got the first and second checks easily with anti_join():
MissingfromData2 <- anti_join(Data1,Data2, by = c("Property.1","Property.2","Property.3"))
MissingfromData1 <- anti_join(Data2,Data1, by = c("Property.1","Property.2","Property.3"))
For the 3rd check, however, I cannot seem to lock in the identifiers with by = c("Property.1","Property.2","Property.3").
When I do the following:
changedValue1 <- setdiff(Data1,Data2, by = c("Property.1","Property.2","Property.3"))
changedValue2 <- setdiff(Data2,Data1, by = c("Property.1","Property.2","Property.3"))
I also get the extra rows from checks 1 and 2, which I do not need.
How do I obtain only the rows whose values have changed?
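
For reference, here is a minimal sketch of the kind of result I am after, using an approach that might work: inner_join() on the three identifier columns, then keeping only the rows where the values differ. The toy data and the value column name `Value` are assumptions made for illustration (the real data is only in the screenshots), as are the `.Data1` / `.Data2` suffixes.

```r
library(dplyr)

# Hypothetical toy data standing in for the screenshots.
# The value column name `Value` is an assumption.
Data1 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item4"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem4"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 20, 30)
)
Data2 <- data.frame(
  Property.1 = c("Item1", "Item2", "Item6"),
  Property.2 = c("SubItem1", "SubItem5", "SubItem1"),
  Property.3 = c("SubsubItem1", "SubsubItem1", "SubsubItem1"),
  Value      = c(10, 25, 40)
)

keys <- c("Property.1", "Property.2", "Property.3")

# inner_join() keeps only rows whose identifiers exist in both frames,
# so the rows from checks 1 and 2 are dropped before values are compared.
changedValue <- inner_join(Data1, Data2, by = keys,
                           suffix = c(".Data1", ".Data2")) %>%
  filter(Value.Data1 != Value.Data2)

print(changedValue)
```

With this toy data only the Item2 / SubItem5 / SubsubItem1 row should remain. Note that filter() drops rows where the comparison is NA, so missing values would need separate handling.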