I'm switching from Excel to R and was wondering how to do the following lookup in R.
I have a dataset that looks something like this:
    df1 <- data.frame(Zipcode = c("7941AH", "7941AG", "7941AH", "7941AZ"),
                      From = c(2, 30, 45, 1),
                      To = c(20, 38, 57, 8),
                      Type = c("even", "mixed", "odd", "mixed"),
                      GPS = c(12345, 54321, 11221, 22331))
    df2 <- data.frame(zipcode = c("7941AH", "7941AH", "7941AH", "7941AG", "7941AG", "7941AZ"),
                      housenum = c(18, 19, 50, 32, 104, 11))
The first dataset contains the zipcode, a house number range (From and To), a Type indicating whether the range covers even, odd, or mixed house numbers, and the GPS coordinates. The second dataset contains only addresses (zipcode, house number).
What I want to do is look up the GPS coordinates for df2. For example, the address with zipcode 7941AH and house number 18 (an even number between 2 and 20) has GPS coordinate 12345.
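To make the rule concrete, the matching can be sketched in base R roughly as follows. This is only an illustration of the intended result, not a solution tuned for large data. It assumes the df2 zipcodes are written as 7941.. so that they match df1, and that an address with no matching range should get NA. The names id, cand, in_range, parity_ok and hits are made up for the example.

```r
# Sample data (df2 zipcodes assumed to be 7941.., matching df1)
df1 <- data.frame(Zipcode = c("7941AH", "7941AG", "7941AH", "7941AZ"),
                  From = c(2, 30, 45, 1),
                  To = c(20, 38, 57, 8),
                  Type = c("even", "mixed", "odd", "mixed"),
                  GPS = c(12345, 54321, 11221, 22331))
df2 <- data.frame(zipcode = c("7941AH", "7941AH", "7941AH", "7941AG", "7941AG", "7941AZ"),
                  housenum = c(18, 19, 50, 32, 104, 11))

# Remember the original row order of df2, because merge() reorders rows
df2$id <- seq_len(nrow(df2))

# Pair each address with every range that shares its zipcode
cand <- merge(df2, df1, by.x = "zipcode", by.y = "Zipcode")

# Keep pairs where the house number lies inside the range and has the right parity
in_range <- cand$housenum >= cand$From & cand$housenum <= cand$To
parity_ok <- cand$Type == "mixed" |
  (cand$Type == "even" & cand$housenum %% 2 == 0) |
  (cand$Type == "odd" & cand$housenum %% 2 == 1)
hits <- cand[in_range & parity_ok, c("id", "GPS")]

# Write the GPS back to df2; addresses without a matching range stay NA
df2$GPS <- hits$GPS[match(df2$id, hits$id)]
df2$id <- NULL

df2$GPS
# 12345 NA NA 54321 NA NA
```

With the sample data only two addresses find a range: 7941AH 18 (even, 2 to 20) and 7941AG 32 (mixed, 30 to 38). Number 19 falls inside 2 to 20 but is odd, and 50 falls inside 45 to 57 but is even, so both stay NA.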
Update: It didn't cross my mind that the size of the dataset matters for the choice of solution (I know, a bit naive...), so here is some extra information: df1 actually has 472,000 observations and df2 has 1.1 million. The number of unique zipcodes in df1 is 280,000. I stumbled upon the post "speed up the loop operation in R", which has some interesting findings, but I don't know how to incorporate them into the solution provided by @josilber.