If I understand your example correctly, you have a situation similar to the one shown below, which is based on the example from the merge documentation.
> (authors <- data.frame(
surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
nationality = c("US", "Australia", "US", "UK", "Australia"),
deceased = c("yes", rep("no", 3), "yes")))
   surname nationality deceased
1    Tukey          US      yes
2 Venables   Australia       no
3  Tierney          US       no
4   Ripley          UK       no
5   McNeil   Australia      yes
> (books <- data.frame(
name = I(c("Tukey", "Venables", "Tierney",
"Ripley", "Ripley", "McNeil", "R Core")),
title = c("Exploratory Data Analysis",
"Modern Applied Statistics ...", "LISP-STAT",
"Spatial Statistics", "Stochastic Simulation",
"Interactive Data Analysis",
"An Introduction to R"),
deceased = c("yes", rep("no", 6))))
      name                         title deceased
1    Tukey     Exploratory Data Analysis      yes
2 Venables Modern Applied Statistics ...       no
3  Tierney                     LISP-STAT       no
4   Ripley            Spatial Statistics       no
5   Ripley         Stochastic Simulation       no
6   McNeil     Interactive Data Analysis       no
7   R Core          An Introduction to R       no
> (m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
   surname nationality deceased.x                         title deceased.y
1   McNeil   Australia        yes     Interactive Data Analysis         no
2   Ripley          UK         no            Spatial Statistics         no
3   Ripley          UK         no         Stochastic Simulation         no
4  Tierney          US         no                     LISP-STAT         no
5    Tukey          US        yes     Exploratory Data Analysis        yes
6 Venables   Australia         no Modern Applied Statistics ...         no
Here authors might represent your first data frame and books your second, and deceased might be the value that appears in both data frames but is only up to date in one of them (authors).
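Before dropping anything, it can help to see where the two copies actually disagree. The following is only a sketch built on the same example data; the name conflicts is my own, and the as.character() calls are there on the assumption that older R versions may have turned the deceased columns into factors.

```r
# Rebuild the example data so this snippet runs on its own
authors <- data.frame(
  surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 3), "yes"))
books <- data.frame(
  name = I(c("Tukey", "Venables", "Tierney",
             "Ripley", "Ripley", "McNeil", "R Core")),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...", "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  deceased = c("yes", rep("no", 6)))

m1 <- merge(authors, books, by.x = "surname", by.y = "name")

# Keep only the rows where the two copies of deceased differ
# (conflicts is a hypothetical helper name, not part of the original example)
differs <- as.character(m1$deceased.x) != as.character(m1$deceased.y)
conflicts <- m1[differs, c("surname", "deceased.x", "deceased.y")]
print(conflicts)
```

With this data the only mismatch is the McNeil row, where authors says "yes" and books says "no".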
The easiest way to include only the correct value of deceased would be to exclude the incorrect one from the merge.
> (m2 <- merge(authors, books[names(books) != "deceased"],
by.x = "surname", by.y = "name"))
   surname nationality deceased                         title
1   McNeil   Australia      yes     Interactive Data Analysis
2   Ripley          UK       no            Spatial Statistics
3   Ripley          UK       no         Stochastic Simulation
4  Tierney          US       no                     LISP-STAT
5    Tukey          US      yes     Exploratory Data Analysis
6 Venables   Australia       no Modern Applied Statistics ...
The expression books[names(books) != "deceased"] subsets the data frame books to remove its deceased column, leaving only the correct deceased column from authors in the final merge.
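If your real data frames share more than one stale column, the same idea can be generalised by dropping every column name the two data frames have in common. This is a rough sketch rather than the only way to do it; shared is a name I made up, and it relies on the key columns having different names (surname and name), as they do here. If your key had the same name in both data frames you would need to keep it out of shared.

```r
# Rebuild the example data so this snippet runs on its own
authors <- data.frame(
  surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 3), "yes"))
books <- data.frame(
  name = I(c("Tukey", "Venables", "Tierney",
             "Ripley", "Ripley", "McNeil", "R Core")),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...", "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  deceased = c("yes", rep("no", 6)))

# Column names present in both data frames; here that is just "deceased"
shared <- intersect(names(authors), names(books))

# Drop the shared columns from books so the copies in authors are the ones kept
m2 <- merge(authors, books[setdiff(names(books), shared)],
            by.x = "surname", by.y = "name")
print(names(m2))
```

The result should have the same columns as the m2 shown above, with a single deceased column taken from authors.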