I am trying to use the %in% function to match the observations of one of my datasets to the observations of another dataset. Essentially, I would like to make two new datasets, one that contains the observations of the second dataset, and another which contains all other observations. Here is an example dataset:
Df
Last.Name First.Name Group
Williams Bob A
Williams Dan C
Miller Bob A
Smith Dan C
Williams Rick A
Smart Jeff C
Miller Bob A
Smith Dan C
Jones Bob A
Williams Buddy C
Miller Bob A
Hends Dan C
Williams Rick A
Smart Jeff C
Millers Bob A
Smith Danny C
Here is a dataset that I am trying to match the observations:
dfMatch
LastName FirstName
Williams Bob
Williams Buddy
Miller Bob
Smith Dan
Williams Rick
Smart Jeff
Miller Bob
Smith Dan
I tried various versions of the following code:
newdf<-Df[ Df$Last.Name %in% DfMatch$LastName & Df$First.Name %in% DfMatch$FirstName,]
and
newdf<-Df[ which(Df$Last.Name %in% DfMatch$LastName & Df$First.Name %in% DfMatch$FirstName),]
To get this new dataset:
newDF
Last.Name First.Name Group
Williams Bob A
Miller Bob A
Smith Dan C
Williams Rick A
Smart Jeff C
Miller Bob A
Smith Dan C
Williams Buddy C
However, this does not work.
I would also like to use similar code to build a dataset which includes all observations not listed in the dfMatch set, such as:
DfNoMatch
Last.Name First.Name Group
Williams Dan C
Jones Bob A
Miller Bob A
Hends Dan C
Williams Rick A
Smart Jeff C
Millers Bob A
Smith Danny C
By using code similar to:
DfNoMatch<-Df[ !Df$Last.Name %in% DfMatch$LastName & Df$First.Name %in% DfMatch$FirstName,]
and
DfNoMatch<-Df[! which(Df$Last.Name %in% DfMatch$LastName & Df$First.Name %in% DfMatch$FirstName),]
Thank you in advance and any help is greatly appreciated!