I have two data-frames
One data-frame has a column 'zip codes' which has the complete zip codes. It also has several other columns like shop names, etc.
The 2nd one has a column called zip codes range which just has the range for zip codes for each city.
How do I join these two data-frames on zip codes such that I can add the correct cities to the dataframe 1?
I can think of the nested for loop and comparing the max/min of each range to the zip codes in the 2nd data-frame. But that is taking a long time to run ~ 100mill comparisons
Edit: Data frame 1: This one only has zip codes. I want the cities to be populated here.
| Shop names | Zip Codes |
|-----------------------|-----------|
| Bergin and botts | 029888 |
| WW and Co | 100397 |
| Higgin Bothams | 100430 |
| Bertie's Beans | 100459 |
| Leaky Cauldron | 310283 |
| Pet Peeves | 310330 |
| Lucy's coffee shop | 910345 |
| Dream cathers | 465250 |
| Dragon supplies | 479187 |
| SLUG AND JIGGER'S | 934464 |
| FLOURISH AND BLOTTS. | 937833 |
| MADAM MALKIN'S ROBES | 931283 |
Dataframe2: This one has the zip code ranges and the corresponding cities.
| City | Zip ranges |
|----------------|---------------|
| braavos | 029918-100290 |
| highgarden | 100389-100440 |
| vale | 200410-219000 |
| dorne | 310229-367890 |
| storms end | 389032-567000 |
| king's landing | 601000-898000 |
| winterfell | 910230-940200 |
I created some sample data here. the original data has around a million rows for dataframe1 and 5k rows for dataframe2. Hence the for loop logic will be very cumbersome.
Appreciate any help!
dput(df1)
structure(list(ï...Shop.names = c(" Bergin and botts ",
" WW and Co ", " Higgin Bothams ", " Bertie's Beans ",
" Leaky Cauldron ", " Pet Peeves ", " Lucy's coffee shop ",
" Dream cathers ", " Dragon supplies ", " SLUG AND JIGGER'S ",
" FLOURISH AND BLOTTS. ", " MADAM MALKIN'S ROBES "), Zip.Codes = c("29888",
"100397", "100430", "100459", "310283", "310330", "910345", "465250",
"479187", "934464", "937833", "931283")), class = "data.frame", row.names = c(NA,
-12L))
dput(df2)
structure(list(ï...City = c(" braavos ", " highgarden ",
" vale ", " dorne ", " storms end ", " king's landing ",
" winterfell "), Zip.ranges = c(" 029918-100290 ", " 100389-100440 ",
" 200410-219000 ", " 310229-367890 ", " 389032-567000 ", " 601000-898000 ",
" 910230-940200 ")), class = "data.frame", row.names = c(NA,
-7L))