A simple way to do this is to maintain a name lookup table that defines the mapping from names in your data sets to a common set of reference names. This table will have two columns:
- A
name
column for the names in your data sets.
- A
reference
column for the corresponding reference names to which you want to map.
For example, this name lookup table can be maintained as a CSV file named names.table.csv
:
name, reference
Year, YEAR
Month, MONTH
Revenue, Total Revenue
YEAR, YEAR
MONTH, MONTH
TOT.REV, Total Revenue
This CSV file can then be read into R:
names.table <- read.csv("/path/to/names.table.csv", header=TRUE, strip.white=TRUE, stringsAsFactors=FALSE)
print(names.table)
## name reference
##1 Year YEAR
##2 Month MONTH
##3 Revenue Total Revenue
##4 YEAR YEAR
##5 MONTH MONTH
##6 TOT.REV Total Revenue
It is important that we specify strip.white=TRUE
and stringsAsFactors=FALSE
to remove all leading and trailing white spaces and to maintain the entries as character strings.
Then, assuming that we have data sets (as data frames) that have column names that are in the name
column of the names.table
, for example:
df1 <- data.frame(Year = c(2014, 2014), Month=c("Jan", "Feb"), Revenue=c(124.5, 123.5), stringsAsFactors=FALSE)
print(df1)
## Year Month Revenue
##1 2014 Jan 124.5
##2 2014 Feb 123.5
df2 <- data.frame(YEAR = c(2015, 2015), MONTH=c("Jan", "Feb"), TOT.REV=c(154.5, 132.5), stringsAsFactors=FALSE)
print(df2)
## YEAR MONTH TOT.REV
##1 2015 Jan 154.5
##2 2015 Feb 132.5
df3 <- data.frame(YEAR = c(2016, 2016), Month=c("Jan", "Feb"), TOT.REV=c(154.5, 132.5), stringsAsFactors=FALSE)
print(df3)
## YEAR Month TOT.REV
##1 2016 Jan 154.5
##2 2016 Feb 132.5
mapping those names can be performed by:
colnames(df1) <- names.table$reference[match(colnames(df1), names.table$name)]
print(df1)
## YEAR MONTH Total Revenue
##1 2014 Jan 124.5
##2 2014 Feb 123.5
colnames(df2) <- names.table$reference[match(colnames(df2), names.table$name)]
print(df2)
## YEAR MONTH Total Revenue
##1 2015 Jan 154.5
##2 2015 Feb 132.5
colnames(df3) <- names.table$reference[match(colnames(df3), names.table$name)]
print(df3)
## YEAR MONTH Total Revenue
##1 2016 Jan 154.5
##2 2016 Feb 132.5
Hope this helps.