0

I essentially wrote this short function for importing all sheets from an excel file to dataframes in a list, and adding names to the list items:

read_full_excel <- function(x){ 

# Imports the name of the sheets in the excel file and creates a list of data frames:  
  sheet_names <- readxl::excel_sheets(x) 

# Create list and add each excel sheet as a data frame by a loop:
  sheets <- list()
  for(i in 1:length(sheet_names)){
    sheets[[i]] <-  readxl::read_xlsx(x, sheet = sheet_names[i])
  }

# Add the sheet names to the list: 
  names(sheets) <- sheet_names

  return(sheets)

}

Now, the function successfully creates a list of data frames from the excel sheets, but for some reason the names are not applied. I've run names(list_name) <- excel_sheets("file.xlsx") outside the function, after the list was created, and for some reason this works. So, why won't it work inside the function?

Pål Bjartan
  • 793
  • 1
  • 6
  • 18
  • Just tried it on an excel sheet and your function worked for me. [Here](https://stackoverflow.com/a/12945838/12400385) is a similar answer which basically uses the same approach, but `lapply` rather than a `for` loop – nniloc May 13 '20 at 23:19
  • That's strange, it doesn't work for me. Your link seems to be the answer I was looking for. I did try to figure out `lapply` method to accomplish this, but I couldn't figure out how to `lapply` the names list to a conditional argument of `read_xlsx`. I'm still having trouble understanding the syntax though: `x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X))`. Where does that `X` variable come from? How does the function know that it is supposed to get the sheet name from `sheets` this way? Sorry for nagging, I'm just trying to understand this. – Pål Bjartan May 14 '20 at 08:06
  • In lapply or sapply, the function(X) is applied to each element of vector sheets, which is equivalent to write a loop `for (i in 1:length(sheets))` and apply `funtion(sheets[i])`. If function need more parameters, they can added after `function(X) {}`. – Marcelo Fernando Befumo May 14 '20 at 15:07

1 Answers1

1

It's a problem in Windows, not Linux (I guess MacOS doesn't either). You need to provide the full path to the excel file.

# This is not working
> read_full_excel("TEST.xslx")
 Error: `path` does not exist: ‘TEST.xslx’ 

# This is OK
> read_full_excel("C:\\Users\\marcelo\\Documents\\TEST.xlsx")

# you can use your function interactively
read_full_excel(file.choose())

# or add this optionally in your function
read_full_excel <- function(x = NULL){ 

# Imports the name of the sheets in the excel file and creates a vector:  
  if (is.null(x) {
      x <- file.choose()
   } 
   sheet_names <- readxl::excel_sheets(x)

   # continue the function code as before ........ 

}

# usage
read_full_excel()
# or 
read_full_excel("C:\\my_full_path_to_excel_file") 

It seems to work well for me (in Debian Linux):

> zz<-read_full_excel("./Documentos/test.xlsx")
> names(zz)
[1] "Hoja1" "Hoja2" "Hoja3" "Hoja4"

Just as alternative reading and naming list elements by using sapply:

read_full_excel <- function(x){ 

  # Imports the name of the sheets in the excel file and creates a vector:  
  sheet_names <- readxl::excel_sheets(x) 

  # Alternative equivalent function using sapply. Note the inverted order 
  # of arguments x ans sheet_names for the internal function
  # See sapply parameter USE.NAMES = TRUE
  sapply(
    sheet_names, 
    function(s_name, x) { readxl::read_xlsx(path = x, sheet = s_name) },
    x, 
    USE.NAMES = TRUE
    )
}

  • Strange, it will not work for me. I tried your solution instead, as I was trying to do something like that at first. However, I still won't get the sheet names applied to the list. I see you added a variable `s_name`, which I can't see declared anywhere. I assumed that was a typo, so I substituted it with `sheet_names` when I re-wrote the function. I was also wondering why you use `sapply` instead of `lapply`? Is there an advantage to this? – Pål Bjartan May 14 '20 at 08:35
  • s_name is a variable of the anonymous function in sapply, it is ok, Sheet_names is applied into the anonymous function through s_name. I used sapply instead of lapply because of sapply has the logical pararmeter USE.NAMES which names each element of the list whith te respective sheet name. Of course you can use lapply and then run `names(sheets) <- sheet_names`. – Marcelo Fernando Befumo May 14 '20 at 14:38
  • I've updated my previous answer, in Windows you need to provide the full path to the excel file – Marcelo Fernando Befumo May 14 '20 at 15:32
  • Nice solution with `sapply`! – nniloc May 14 '20 at 19:55