0

I have two lists, one with the excel file paths that I would like to read and another list with the file names that I would like to assign to each as a dataframe. Trying to create a loop using the below code but the loop only creates a single dataframe with name n. Any idea how to make this work?

 files <- c("file1.xlsx","file2.xlsx")
 names <- c('name1','name2')

 for (f in files) {
 for (n in names)   {
 n <- read_excel(path = f)
  }
 }
NelsonGon
  • 13,015
  • 7
  • 27
  • 57
Matt Gossett
  • 184
  • 3
  • 13
  • 2
    It has been said many times on R posts of SO to save many similar data frames into a single list and not as separate objects flooding your global environment. See a [canonical answer](https://stackoverflow.com/a/24376207/1422451). – Parfait Mar 04 '20 at 14:14
  • 1
    Change `read.csv` to `read_excel` or `read_xlsx` in the linked post. – NelsonGon Mar 04 '20 at 14:21

2 Answers2

2

You are overwriting n on each iteration of the loop

Edit:

@Parfait commented that we shouldn't use assign if we can avoid it, and he is right (e.g. why-is-using-assign-bad)

This does not use assign and puts the data in a neat list:

files <- c("file1.xlsx","file2.xlsx")
names <- c('name1','name2')

result <- list()
for (i in seq_along(files)) {
  result[names[i]] <- read_excel(path = files[i]))
}

Old and not recommended answer (only left here for transparency reasons):

We can use assign to use a character string as variable name:

files <- c("file1.xlsx","file2.xlsx")
names <- c('name1','name2')

for (i in seq_along(files)) {
  assign(names[i], read_excel(path = files[i]))
}
dario
  • 6,415
  • 2
  • 12
  • 26
  • 1
    This won't work as it stands because of the double loop. `file2.xlsx` would be written to both `name1` and `name2`. – Miff Mar 04 '20 at 14:08
  • @Miff Thanks good catch! I edited my answer – dario Mar 04 '20 at 14:09
  • 3
    `assign` is nearly same as `attach` as methods in R to avoid. – Parfait Mar 04 '20 at 14:17
  • @Parfait Thank you for your comment. I tried to account for it in my edit. – dario Mar 04 '20 at 14:24
  • 2
    The edit (using a pre-allocated `result <- list()`) is *much* better, addressing the biggest mistakes in efforts like this: (1) no "growing objects"; (2) no use of `get` and/or `assign`. @dario, if you're going to keep the original answer, would you consider moving it to the end so that the preferred answer is first? – r2evans Mar 04 '20 at 15:05
1

An alternative is to loop through all Excel files in a folder, rather than a list. I'm assuming they exist in some kind of folder, somewhere.

# load names of excel files 
files = list.files(path = "C:/your_path_here/", full.names = TRUE, pattern = ".xlsx")

# create function to read multiple sheets per excel file
read_excel_allsheets <- function(filename, tibble = FALSE) {
  sheets <- readxl::excel_sheets(filename)
  sapply(sheets, function(f) as.data.frame(readxl::read_excel(filename, sheet = f)), 
         simplify = FALSE)
}

# execute function for all excel files in "files"
all_data <- lapply(files, read_excel_allsheets)

Updated...

ASH
  • 20,759
  • 19
  • 87
  • 200
  • 1
    Strongly recommend *against* this methodology: see Chapter 2: Growing Objects in [The R Inferno](https://www.burns-stat.com/pages/Tutor/R_inferno.pdf). – r2evans Mar 04 '20 at 15:01
  • I thought that code worked, but I didn't test it. I just updated my answer with a sample of code that seems to work well. I'm really much more into Python these days. I used to do some R coding, but I hardly touch on it anymore. – ASH Mar 04 '20 at 18:39
  • Even in Python you want to avoid [growing object in a loop](https://stackoverflow.com/a/36489724/1422451). – Parfait Mar 04 '20 at 19:01