2

Currently, I'm working with Excel workbooks for > 15 sheets. I use the following sequence:

#libraries
library(data.table)
library(openxlsx)    

#load excel file
    wb <- loadWorkbook("Data.xlsx")

    #write sheets into a list of data frames
    name <- names(wb)
    df <- list()
    for (i in 1:length(name)){

      d <- lapply(i, function(i) readWorkbook(wb, sheet = name[i], startRow = 1, colNames = TRUE, rowNames = FALSE, detectDates = TRUE, skipEmptyRows = TRUE, skipEmptyCols = TRUE, rows = NULL, cols = NULL, check.names = FALSE, namedRegion = NULL, na.strings = "NA", fillMergedCells = FALSE))

      df <- append(df, d)
    }



    #assign names to columns
    names(df) <- name

    #merge DFs into one list
    DT <- rbindlist(df)

Now, I've been working mostly with very small data sets, so performance is not an issue. However, I was wondering if there's an easier way to perform this, in case I'm working with big data sets.

Thanks a lot for your input already.

zonfl
  • 328
  • 2
  • 8
  • Why are you using both a `for` loop and `lapply`? – Axeman Oct 10 '17 at 09:40
  • Something like this answer - https://stackoverflow.com/a/11433532/3022126 - could be very easily adapted to load multiple Excel workbooks/sheets – Phil Oct 10 '17 at 09:52
  • Without the loop, it keeps overwriting `d`: `List of 1 $ :'data.frame': 490 obs. of 3 variables: ..$ datetime: num [1:490] 42866 42866 42866 42866 42866 ... ..$ ID : chr [1:490] "t13" "t13" "t13" "t13" ... ..$ value : chr [1:490] "8 mins" "8 mins" "8 mins" "8 mins" ... ` But the list should have 13 entries in this case. Therefore, I added the loop to keep appending the imported values instead of overwriting them. Is this possible merely with lapply? – zonfl Oct 10 '17 at 09:52
  • @Phil that looks like a good one, I'll dig through it, thanks – zonfl Oct 10 '17 at 09:54
  • Yes, something like `l <- lapply(names(wb), function(x) readWorkbook(wb, sheet = x)); DT <- rbinlist(l)` – Axeman Oct 10 '17 at 10:00
  • `DT <- lapply(names(wb), function(x) readWorkbook(wb, sheet = x))` works like a charm. I also found the mistake: `d <- lapply(i, function(i) readWorkbook(wb, sheet = name[i]))` This function refers only to the sheet from the iteration `i`, instead of all the sheets in `names(wb)`. Directly adressing names `names(wb)` allows `lapply` to iterate over the sheets of the workbook directly. `d <- lapply(names(wb), function(x) readWorkbook(wb, sheet = x))` It seems that I didn't really understand how lapply works. Do you want to write it as an answer that I accept so you get the credit? – zonfl Oct 10 '17 at 10:15

0 Answers0