
BACKGROUND: I have two data frames, each filled in manually by a different researcher, containing time data that tracks how a group of participants reaches a consensus when making a decision. We log the time of each preference statement along with the preference itself (ranked by priority).

QUESTION: What functions or packages can I use to show the discrepancies between the two data frames?

EXAMPLE:

discrepancies <- show_discrepancies(myData1, myData2)

discrepancies

outputExample1

provides a data frame containing only the entries that do not match

outputExample2

provides a combined data frame, with entries from both myData1 and myData2, and the entries that do not have a match are highlighted red

Either output would work, but I would prefer outputExample1 if possible.

Community
  • Questions asking for tool recommendations are off-topic here, so this will likely be closed. (However, look at the **daff** package; it's pretty slick.) – joran Oct 31 '18 at 19:53
  • Questions just asking to recommend functions/packages are considered off-topic. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions (if packages/functions exist to help, they will be included in the answer). – MrFlick Oct 31 '18 at 19:54
  • Welcome to SO, Chris. Please [take the tour](https://stackoverflow.com/tour) and read the help center article on [how to ask a good question](https://stackoverflow.com/help/how-to-ask). As @MrFlick says, a good question will have sample input and desired output, rather than just text describing the behavior you're looking for. – De Novo Oct 31 '18 at 20:05
  • This should be straightforward if you have one or more "key" variables that identify which observations from the first table should be paired with which observations from the second. First you'd join the two tables, to get a new combined table. Then you could filter to just show observations where the values differ. https://dplyr.tidyverse.org/reference/join.html – Jon Spring Oct 31 '18 at 20:09
  • @JonSpring Thank you! I read the documentation and this is EXACTLY what I need. I will give it a shot at work today. – Chris Larosee Nov 01 '18 at 13:52
  • @JonSpring, Thank you again. the dplyr method was successful! – Chris Larosee Nov 06 '18 at 17:53

1 Answer


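Assuming the two data frames have the same structure (identical column names and types) and neither contains duplicate rows of its own, you can get outputExample1 using the following function. Note that keeping only `!duplicated(data)` would return every distinct row, matched ones included, so the rows must be checked for duplicates in both directions:

show_discrepancies <- function(data1, data2) {
  # Stack both data frames into one
  data <- rbind(data1, data2)
  # Flag every row that has an exact copy elsewhere in the stacked data
  matched <- duplicated(data) | duplicated(data, fromLast = TRUE)
  # Keep only the rows that have no match
  data[!matched, ]
}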

Also take a look at the join functions available in the dplyr package.
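As a rough sketch of the dplyr approach (join the tables, then keep what does not match), something like the following may work. The data and the column names `participant`, `time`, and `preference` are made up for illustration, since the question does not show the real structure; the sketch also assumes a row counts as a match only when all three columns agree.

```r
library(dplyr)

# Hypothetical example data: two researchers' logs of the same session
myData1 <- data.frame(
  participant = c("A", "B", "C"),
  time        = c(12, 30, 45),
  preference  = c(1, 2, 1),
  stringsAsFactors = FALSE
)
myData2 <- data.frame(
  participant = c("A", "B", "C"),
  time        = c(12, 31, 45),   # researcher 2 logged B one second later
  preference  = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# Columns that must all agree for two rows to count as a match (assumed)
cols <- c("participant", "time", "preference")

# Rows of each table that have no exact match in the other
only_in_1 <- anti_join(myData1, myData2, by = cols)
only_in_2 <- anti_join(myData2, myData1, by = cols)

# outputExample1: only the non-matching entries, labelled by source
discrepancies <- bind_rows(
  mutate(only_in_1, source = "myData1"),
  mutate(only_in_2, source = "myData2")
)
discrepancies

# outputExample2: all entries combined, with a flag marking matched rows
combined <- full_join(
  mutate(myData1, in_1 = TRUE),
  mutate(myData2, in_2 = TRUE),
  by = cols
) %>%
  mutate(matched = !is.na(in_1) & !is.na(in_2))
combined
```

With this toy data, `discrepancies` should hold the two conflicting rows for participant B, one from each source. The `matched` column in `combined` could then drive any highlighting when the table is displayed.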

Tom