4

There are many posts about XLConnect and reading excel files in R with XLConnect like How to read multiple excel sheets in R programming?, including rbind function, but no one answers this question:

If i had multiple excel .xls files in a directory how can i use a loop for reading and merging each one in order?

I have a directory so i do this:

 setwd("C:/Users/usuario/Desktop")
  library(rjava)
  library(XLConnect)

That directory has 28 excel files named like this:

 Bitacora_Metrocali_01_02_2014C
 Bitacora_Metrocali_02_02_2014C
         .    ...    ...
 Bitacora_Metrocali_28_02_2014C

So i need to merge them using the function: Merge(x,y,all=T)

So it can add new columns to the dataframe. The thig is, i need a dataframe that starts merging the first wht the second,and then adding all the new sheets in order. All excel files of interest are in sheet 1.

THX!

Community
  • 1
  • 1
JULIAN
  • 47
  • 1
  • 2
  • 7
  • I can read them, and merge too, i mean, one per one. But i have too many files. In the example i have 28, but there are more...for real. I need a loop to do that. Thx – JULIAN Mar 14 '14 at 02:48

2 Answers2

3

Does this work for you:

# This will give you a vector of the names of files in your current directory 
# (where I've assumed the directory contains only the files you want to read)
data.files = list.files()

# Read the first file
df = readWorksheetFromFile(file=data.files[1], sheet=1)

# Loop through the remaining files and merge them to the existing data frame
for (file in data.files[-1]) {
    newFile = readWorksheetFromFile(file=file, sheet=1)
    df = merge(df, newFile, all=TRUE)
}
eipi10
  • 91,525
  • 24
  • 209
  • 285
  • It gets some as.posixct errors but that is cuz the excel file is so complex. Thx! I did with others and works. – JULIAN Mar 14 '14 at 21:07
2

Here's an lapply and Reduce approach I am using the the read.xls from gdata package, as you mentioned xls files. If it is xlsx instead, substitute read.xls with readWorksheetFromFile and load the appropriate libraries.

library(gdata)
data.files = list.files(pattern = "*.xls") #get list of files
data.to.merge <- lapply(files, read.xls) #read in files using lapply
merged.data <- Reduce(function(...) merge(..., all = T),data.to.merge)#merge all the files

The merged.data will have data from all the sheets, and will also handle the case of files with different headers.

infominer
  • 1,981
  • 13
  • 17