I want to create a new variable (Sex) in a data frame by matching the IDs in my main data frame against a reference data frame that contains the sex of each individual.
I have the following code, which works, but because my data frame has over 6 million rows it takes more than 10 hours to run.
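data.df$Sex_Varb <- NA
for (i in seq_len(nrow(data.df))) {
  # Positions in the reference data whose ID equals the current row's ID
  find.match <- which(data.df$ID_Varb[i] == Reference_Dat$ID_Varb)
  if (length(find.match) != 0) {
    # Use the first match so a duplicated reference ID cannot yield a multi-value assignment
    data.df$Sex_Varb[i] <- Reference_Dat$Sex[find.match[1]]
  }
}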
Is there a faster way to create a new variable based on the matching values between two datasets?
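For reference, one vectorized possibility is base R's match(), which looks up every ID in a single call instead of scanning the reference data once per row. The sketch below uses small made-up data frames in place of the real ones (the values are hypothetical) and assumes that taking the first matching row is acceptable when an ID appears more than once in Reference_Dat.

```r
# Toy stand-ins for the real data (hypothetical values).
data.df <- data.frame(ID_Varb = c(3, 1, 4, 1, 5))
Reference_Dat <- data.frame(ID_Varb = c(1, 3, 5),
                            Sex = c("F", "M", "F"),
                            stringsAsFactors = FALSE)

# match() returns, for each ID in data.df, the position of its first
# occurrence in Reference_Dat$ID_Varb, or NA when the ID is not found.
idx <- match(data.df$ID_Varb, Reference_Dat$ID_Varb)

# Indexing with NA yields NA, so rows without a match stay NA,
# just as in the loop version.
data.df$Sex_Varb <- Reference_Dat$Sex[idx]
```

Here the row with ID 4 has no entry in the reference data, so its Sex_Varb remains NA. merge() with all.x = TRUE would be another option, though it may change the row order of the result.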