1

I am trying to reformat an entire directory of xlsx files using R.

I initially created a code that could do this for a single file but this method would require me to manually update/input each separate file name from the file directory in the code. The code that I created for single formats relied on data.table for reformatting the xlsx file.

Now I am trying to use the method below:

FilesDirec = list.files(pattern = "\\.xlsx")

newcols <- c("WORD", "START", "END", "H/W", "IO TYPE", "ICD TYPE", "ICD DESCRIPTION")

df_list <- list()
for(i in seq_along(FilesDirec)){
  df_list[[FilesDirec[i]]] <- read.xlsx(FilesDirec[i], sheetIndex = 1, header = TRUE)
  colnames(df_list[[FilesDirec[i]]]) <- newcols
}

The for loop reads in all xlsx files in directory to df_list and relabels the headers to uppercase.

I now need to add in additional columns to the data sets contained in df_list. The additional columns that are created reference the 5th column of the xlsx files for their row data.

The code I originally wrote to do this was

icd <- read.xlsx("SomeFile.xlsx", sheetIndex = 1, header = TRUE)
xx <- icd[,c(5)]

DT <- data.table(icd)

# This is the required column for "CAN TYPE"

DT[, 'CAN TYPE' := ifelse( xx %in% c("AI", "DI", "EN", "BI"), "CI",
                       ifelse( xx %in% c("DO", "AO", "OUT", "BO"), "CO", NA))]

So my question is how do I reference the 5th column for all xlsx file data located in df_list in order to create new columns based off the 5th column information for the entire directory of xlsx files?

ryry
  • 86
  • 9
  • nested `ifelse` are painful, see http://stackoverflow.com/q/36390967/3576984 for a workaround – MichaelChirico Apr 10 '16 at 16:31
  • You can use `lapply` to cycle through your list of data.frames or data.tables, df_list, however you decided to store them. You can write a function that returns the data set with the new "CAN TYPE" variable. Then use that function within `lapply`. – lmo Apr 10 '16 at 16:47

0 Answers0