More than the programming, I am lost on the right approach for this problem. I have 2 data frames with a market name column. Unfortunately the names vary by a few characters / symbols in each column, for e.g. Albany.Schenectady.Troy = ALBANY, Boston.Manchester = BOSTON.
I want to standardize the market names in both data frames so I can perform merge
operations later.
I thought of tackling the problem in two steps: 1) Create a vector of the unique market names from both tables and use that to create a look up table. Something that looks like:
Table 1 Markets > "Albany.Schenectady.Troy" , "Albuquerque.Santa.Fe", "Atlanta" . . . .
Table2 Markets > "SPOKANE" , "BOSTON" . . .
I tried marketnamesvector <- paste(unique(Table1$Market, sep = "", collapse = ","))
but that doesn't produce the desired output.
2) Change Market names in Table 2 to equivalent market names in Table 1. For any market name not available in Table 1, Table 2 should retain the same value in market name.
I know I could use a looping function like below but I still need a lookup table I think.
replacefunc <- function (data, oldvalue, newvalue) {
newdata <- data
for (i in unique(oldvalue)) newdata[data == i] <- newvalue[oldvalue == i]
newdata
}
Table 1: This table is 90 rows x 2 columns and has 90 unique market names.
Market Leads Investment Leads1 Leads2 Leads3
1 Albany.Schenectady.Troy NA NA NA NA NA
2 Albuquerque.Santa.Fe NA NA NA NA NA
3 Atlanta NA NA NA NA NA
4 Austin NA NA NA NA NA
5 Baltimore NA NA NA NA NA
Table 2 : This table is 150K rows x 20 columns and has 89 unique market names.
> df
Spot.ID Date Hour Time Local.Date Broadcast.Week Local.Hour Local.Time Market
2 13072765 6/30/14 0 12:40 AM 2014-06-29 1 21 9:40 PM SPOKANE
261 13072946 6/30/14 5 5:49 AM 2014-06-30 1 5 5:49 AM BOSTON
356 13081398 6/30/14 10 10:52 AM 2014-06-30 1 7 7:52 AM SPOKANE
389 13082306 6/30/14 11 11:25 AM 2014-06-30 1 8 8:25 AM SPOKANE
438 13082121 6/30/14 8 8:58 AM 2014-06-30 1 8 8:58 AM BOSTON
469 13081040 6/30/14 9 9:17 AM 2014-06-30 1 9 9:17 AM ALBANY
482 13080104 6/30/14 12 12:25 PM 2014-06-30 1 9 9:25 AM SPOKANE
501 13082120 6/30/14 9 9:36 AM 2014-06-30 1 9 9:36 AM BOSTON
617 13080490 6/30/14 13 1:23 PM 2014-06-30 1 10 10:23 AM SPOKANE