I am trying to reformat an entire directory of xlsx files using R.
I initially created a code that could do this for a single file but this method would require me to manually update/input each separate file name from the file directory in the code. The code that I created for single formats relied on data.table for reformatting the xlsx file.
Now I am trying to use the method below:
FilesDirec = list.files(pattern = "\\.xlsx")
newcols <- c("WORD", "START", "END", "H/W", "IO TYPE", "ICD TYPE", "ICD DESCRIPTION")
df_list <- list()
for(i in seq_along(FilesDirec)){
df_list[[FilesDirec[i]]] <- read.xlsx(FilesDirec[i], sheetIndex = 1, header = TRUE)
colnames(df_list[[FilesDirec[i]]]) <- newcols
}
The for loop reads in all xlsx files in directory to df_list and relabels the headers to uppercase.
I now need to add in additional columns to the data sets contained in df_list. The additional columns that are created reference the 5th column of the xlsx files for their row data.
The code I originally wrote to do this was
icd <- read.xlsx("SomeFile.xlsx", sheetIndex = 1, header = TRUE)
xx <- icd[,c(5)]
DT <- data.table(icd)
# This is the required column for "CAN TYPE"
DT[, 'CAN TYPE' := ifelse( xx %in% c("AI", "DI", "EN", "BI"), "CI",
ifelse( xx %in% c("DO", "AO", "OUT", "BO"), "CO", NA))]
So my question is how do I reference the 5th column for all xlsx file data located in df_list in order to create new columns based off the 5th column information for the entire directory of xlsx files?