-2

I have an excel file, and I want to read a column (i.e first 128 values of third column) line by line and save it in a list. I have written this code to read each line and extract the values but it doesn't work properly:

con<-file("D:\\MA\\excel_mix_meiningen.xls","r")
datalist<-list()
m<-list()
# which column
spalte<-3
#How many values?
for(i in 1:128)
{

  line<-readLines(con,n=1,warn=FALSE)
  datalist<-c(datalist,sapply(line,"[[",spalte))

}
close(con)
datalist

I am seeing this error:

Error in FUN("ÐÏ\021ࡱ"[[1L]], ...) : subscript out of bounds

I know, there is some package to do this job, but I want to use this methode ;)

Kaja
  • 2,962
  • 18
  • 63
  • 99
  • but as I said, I don't want to use packages. I want to use this code :) – Kaja Nov 07 '13 at 15:51
  • 1
    You cannot. The only way to read an Excel file is to use code specifically designed to read that file format. – joran Nov 07 '13 at 15:54
  • Take a look at this package http://cran.r-project.org/web/packages/XLConnect/index.html – Stedy Nov 07 '13 at 15:55
  • I have tried it but I got another error: http://stackoverflow.com/questions/19769954/reading-data-from-excel-file-in-r – Kaja Nov 07 '13 at 15:59
  • 3
    Your Excel file is rather large. Both XLConnect and xlsx (or rather, the Java interface to Excel) performs extremely poorly for even medium sized .xlsx files. Generally, I won't even bother trying if the file is even 1/10th the size of yours. I've had better luck with the older Excel format (.xls). And of course, you can always export to csv. – joran Nov 07 '13 at 16:02
  • @joran is right, Kaja. Excel uses a specific, proprietary format (some version of XML, I believe). So long as your file stays in that format, it can only be loaded w/ code specifically designed to deal w/ it. – gung - Reinstate Monica Nov 07 '13 at 16:02
  • @joran thanks, it works with xls files. – Kaja Nov 08 '13 at 09:04

1 Answers1

1

You don't have to use any special package if you don't want to, but your code is an inefficient strategy. First, save the worksheet that you are interested in as a tab delimited text file or csv file, etc. Then, simply use ?read.table, or one of its variants, with the argument nrows=128. At this point you assign only the column you want to keep to a list. You can remove the loaded data frame with ?rm.

gung - Reinstate Monica
  • 11,583
  • 7
  • 60
  • 79
  • I have a big excel file, and `read.table()` can not load the file – Kaja Nov 07 '13 at 15:54
  • @joran which one? my code or gung idee? – Kaja Nov 07 '13 at 15:56
  • @joran, your right. I had missed the part about the Excel file. I have adapted my answer. – gung - Reinstate Monica Nov 07 '13 at 15:57
  • @Kaja Yours will never work (directly). Now that gung has edited to clarify that the Excel file must be saved (exported) into an entirely different file format, his method (read.table or read.csv) will work. – joran Nov 07 '13 at 15:57
  • @Kaja, there is no problem with your file being too big; you just need to use the `nrows=128` argument. R will only read in the first 128 rows, which it should not have any trouble fitting in memory. – gung - Reinstate Monica Nov 07 '13 at 15:58
  • @gung I have saved my sheet in CSV but the code doesn't return me the correct values. The values which I dont have in my sheet – Kaja Nov 07 '13 at 16:01
  • Can you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – gung - Reinstate Monica Nov 07 '13 at 16:04