4

we have more than 7000 excel data files for .xlsx(2010). my R version is R 2.15.0. if i do manual operation to convert xlsx to xlx, .cvs, or txt., it will spend more time to do it .

Zhiqiang Chen
  • 49
  • 1
  • 6

2 Answers2

1

I have not used XLSconnect but my students used the package xlsx. Then either the function read.xlsx or read.xlsx2 will read in the xls sheet. This package has options for reading and writing xls format and can read and write specific sheets in the spreadsheet and specific regions.

The only formal guidance I gave my students (biology sophomores) for using this package is that the spreadsheet must be 'well formed'. (all items are data not formulas, first row is the variable name in lower case without any non-letter characters, and rows 2-## have the data for each variable. If it is a record than all the items for the same record are on the same row) It doesn't have *.xls to be this strict but I wanted the minimum of problems for the students as they read their data files.

Schumacher
  • 41
  • 1
  • 1
    The packages `xlsx` and `XLConnect` uses the same Java libraries. But in my view, XLConnect is *much* better (i.e. better usability as well as better coverage of the library) – Andrie May 08 '12 at 14:41
  • Nice to know, I will investigate using this with my students. – Schumacher May 08 '12 at 17:11
  • I use library("XLConnect"),but it appear false. and ask me to re-installment the java. so i re-install the java, but it also can't run. Loading required package: XLConnectJars Loading required package: rJava Error : .onLoad failed in loadNamespace() for 'rJava', details: call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R and Java have matching architectures.") error: object 'key' not found Error: package ‘rJava’ could not be loaded – Zhiqiang Chen May 09 '12 at 11:52
1

Function read.xls in the gdata package will read xlsx and xls files into R. I use this frequently.

Sounds like you've got a lot of Excel files to work with, here's what I do to get a large number of these files (both xlsx and xls) into R:

Set working directly to location of my Excel files

setwd("F:\\ address of folder with all my Excel files")

Make a list of all files in working directory

MyFiles <- list.files()

Check the list

MyFiles

Makeke a list that contains all the data from the xls and xlsx files contained in the working dir. This is like a batch data import function.

library(gdata)
Mylist <- lapply(MyFiles, read.xls) 

Check that this read all the files in the folder, in case some Excel files are corrupt, etc. If the result is FALSE then there is a problem.

identical(length(MyFiles), length(Mylist))

Then I carry on with sapply, etc. to perform functions on the data in the files.

Ben
  • 41,615
  • 18
  • 132
  • 227
  • 1
    Yes, when I upload the gdata package , only using this sentence "a<-read.xls("file.name.xlsx",sheet=3)", all the problem be solved. – Zhiqiang Chen May 08 '12 at 15:31
  • @ZhiqiangChen I've added a few details on batch importing large numbers of Excel files into `R`, that might be useful for you also. – Ben May 08 '12 at 17:18
  • 1
    Yesterday i use gdata package to read the xlsx. The speed is very slow. noly one file, it needs to run 2 seconds. the programs as following: library("gdata") for (i in 1:7000){ file.name<-paste("C:/Users/zhen0001/Desktop/Silciscan data/B4E_",i,"_SilviScan_Report.xlsx",sep="") possibleError=tryCatch(data<- read.xls(file.name,sheet=3),error= function(e) e) if (inherits(possibleError,"error")) next a=min(data[1]) b=max(data[1]) c<-b-a+1 d<-c(i,a,b,c) print(d) } – Zhiqiang Chen May 09 '12 at 11:01
  • You should ask another question separate to this one. Something like "how to speed up import of excel files?" – Ben May 09 '12 at 14:21