How to read in multiple ".xlsx" files to R

Question

Having trouble trying to read in multiple .xlsx files to R from the same directory. I keep getting the following error.

"Error in path.expand(file) : argument "file" is missing, with no default"

My code is as follows.

require(.xlsx)
Files=list.files(path="I:/Marcs_Discretinization_try_1/Attempt1/Actual     Data", pattern=".xlsx")
sapply(Files, read.xlsx2(sheetIndex=8))

The output of object Files looks like this which seemingly does not have the attached path.

 [1] "2015-B1-2OR.xlsx"    "2015-B1-OR10-B.xlsx" "2015-B1-OR10.xlsx"   "2015-B1-OR19.xlsx"   "2015-B2-OR19.xlsx"  
 [6] "2015-O1-2OR.xlsx"    "2015-O1-OR10-B.xlsx" "2015-O1-OR10.xlsx"   "2015-O2-2OR.xlsx"    "2015-O2-OR10-B.xlsx"
[11] "2015-O2-OR10.xlsx"   "2015-X1-2OR.xlsx"    "2015-X1-OR10-B.xlsx" "2015-X1-OR10.xlsx"   "2015-X2-2OR.xlsx"   
[16] "2015-X2-OR10-B.xlsx" "2015-X2-OR10.xlsx"

If you want the full path returned, use `full.names = TRUE` in `list.files` — Ritchie Sacramento, Nov 18 '15 at 03:40
This feels closer however the same error is returned: `Error in path.expand(file) : argument "file" is missing, with no default` — MockCommunity1, Nov 18 '15 at 03:47
Your sapply syntax is wrong, additional arguments to read.xlsx2 come after the name of the function. Try looking at some examples. — joran, Nov 18 '15 at 03:50
also, your use of `sapply` is incorrect which is the source of the error. Try `sapply(Files, function(x) read.xlsx2(x, sheetIndex=8)` — Ritchie Sacramento, Nov 18 '15 at 03:50
This helped @Jay. The data files are being read in as a single massive dataframe where each row is a vector as opposed to individual dataframes. — MockCommunity1, Nov 18 '15 at 04:09
@Dilliplaine33 Many would argue that it would be better to keep them as a list but if you want to assign individual data frames, see http://stackoverflow.com/a/11433532/2835261 — Ritchie Sacramento, Nov 18 '15 at 04:47
Simpler: `sapply(Files, read.xlsx2, sheetIndex=8)`. For your last question, read `help(sapply)` and you may find your answer in `simplify`. — r2evans, Nov 18 '15 at 04:51
@r2evans, even closer! simplify=False makes a more useful dataframe. I am now fumbling with this code to generate individual dataframes but am having errors. `for (i in 1:length(Files)[1]) {assign(paste0("DF", i))=read.xlsx2(Files[i],sheetIndex=1, header=T)}` _Error in assign(paste0("DF", i)) = read.xlsx2(Files[i], sheetIndex = 1, : target of assignment expands to non-language object_ — MockCommunity1, Nov 18 '15 at 05:33
Found it!! `for (i in 1:length(Files)[1]) + {assign(paste0("DF", i), read.xlsx2(Files[i],sheetIndex=1, header=T))}` does the trick. My problem was in the layout of assign. — MockCommunity1, Nov 18 '15 at 05:46
Though it may give you something with which you can work, there are several things about that method that are actively discouraged as well as unnecessary. @Jay suggested keeping them in one data.frame, and that may work better if you were to `cbind` a column indicating the different sources. Even without that, I am inferring that you will be doing similar things to each data.frame, so another option would be to use `simplify=FALSE` (which was my point earlier) and keep them unique in a list; that way, you can apply one function to all of them in one fell swoop. Also, why `length(Files)[1]`? — r2evans, Nov 18 '15 at 06:12
@r2evans `length(Files)[1]` is an artifact of earlier code where I needed to treat something for each row hence the [1]. That may be redundant here. You are very correct in your thinking that I will be doing the same thing to each data frame so I have become very intrigued by simplify=FALSE. I have never accessed data in this fashion, how can a function be applied to all of them within themselves? — MockCommunity1, Nov 18 '15 at 06:35

score 3 · Answer 1 · answered Nov 25 '15 at 03:06

You want to merge all Excel files in a folder?

library(xlsx)
setwd("C:/Users/rshuell001/Desktop/excel_files")
data.files = list.files(pattern = "*.xlsx")
data <- lapply(data.files, function(x) read.xlsx(x, sheetIndex = 1))

for (i in data.files) {
    data <- rbind(data, read.xlsx(i, sheetIndex = 1))
}

score 2 · Answer 2 · edited Feb 24 '21 at 18:29

2

I use a lapply and bind_rows usually. Comfortable and fast.

require(tidyverse)
require(magittr)
require(readxl)

path <- "I:/project/raw_data"
  
url_xlsx <- list.files(path, pattern = "*.xlsx", recursive = TRUE)
  
read_xlsx_files <- function(x){
   df <- read_xlsx(path = paste(path, x, sep = "/"))
   return(df)
 }
  
df <- lapply(url_xlsx, read_xlsx_files ) %>%
   bind_rows()

edited Feb 24 '21 at 18:29

user2948714

671
2
9
13

answered Feb 07 '21 at 20:38

ToWii

590
5
8

How to read in multiple ".xlsx" files to R

2 Answers2

Linked

Related