0

I am importing some columns from multiple csv files into R. I want to delete all the data from row 1472 onward.

temp = list.files(pattern="*.csv")    #Importing csv files
Normalyears<-c(temp[1],temp[2],temp[3],temp[5],temp[6],temp[7],temp[9],temp[10],temp[11],temp[13],temp[14],temp[15],temp[17],temp[18],temp[19],temp[21],temp[22],temp[23])
leapyears<-c(temp[4],temp[8],temp[12],temp[16],temp[20])      #separating the csv files into leap years and normal years

Importing only the second column of each csv file.

myfiles_Normalyears = lapply(Normalyears, read.delim,colClasses=c('NULL','numeric'),sep =",")
myfiles_leapyears = lapply(leapyears, read.delim,colClasses=c('NULL','numeric'),sep =",")

new.data.leapyears <- NULL

for(i in 1:length(myfiles_leapyears)) { 
 in.data <-      read.table(if(is.null(myfiles_leapyears[i])),skip=c(1472:4399),sep=",")
 new.data.leapyears <- rbind(new.data.leapyears, in.data)}

The loop is supposed to delete all the rows from 1472 to 4399.

  Error: Error in read.table(myfiles_leapyears[i], skip = c(1472:4399), sep = ",") : 

'file' must be a character string or connection

Cricketer
  • is `myfiles_leapyears[i]` a string or a connection? You sure that's not where the error is? Actually look closer at your code, why do you have the `if` in the read.table call like that? Can you also make it reproducible? – Josh W. Jun 03 '15 at 21:37
  • There is no way I can upload csv files here. Also, even when I get rid of the "if" statement, I get the following error: Error in myfiles_leapyears[i] : invalid subscript type 'list' – Cricketer Jun 03 '15 at 22:12
  • You don't have to upload all of the csv's. Please see here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Josh W. Jun 03 '15 at 22:45

3 Answers

1

There is an nrows parameter to read.table, so why not try

read.table(myfiles_leapyears[i], nrows = 1471,sep=",")
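
Note that `myfiles_leapyears` already holds the imported data frames, so a reading function needs the file names in `leapyears` instead. Below is a minimal sketch of the `nrows` idea, assuming each file has a header row and two comma-separated columns. The temporary file, `demo_files`, and `keep_rows` are made up for illustration, and 3 stands in for 1471:

```r
# Hypothetical demo file standing in for one of the leap-year CSVs.
demo_file <- tempfile(fileext = ".csv")
write.csv(data.frame(time = 1:10, value = (1:10) / 2),
          demo_file, row.names = FALSE)
demo_files <- c(demo_file, demo_file)  # stands in for `leapyears`

keep_rows <- 3  # would be 1471 for the real data

# Read only the second column and only the first `keep_rows` data rows.
truncated <- lapply(demo_files, read.csv,
                    colClasses = c("NULL", "numeric"),
                    nrows = keep_rows)

# Stack the per-file pieces into one data frame.
new_data <- do.call(rbind, truncated)
```

Reading fewer rows up front avoids importing the 2928 rows that would be thrown away later. If the real files have no header row, `header = FALSE` would need to be passed as well.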

Josh W.
0

It is fine. I just turned the data from a list into a dataframe.

     df <- as.data.frame(myfiles_leapyears,byrow=T)

     leap_df<-head(df,-2928)
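
The number 2928 is the count of rows from 1472 to 4399, so a negative `n` in `head()` drops them from the end. A small sketch of that behaviour, using a made-up 10-row data frame (`demo_df` and `drop_last` are hypothetical names):

```r
# Hypothetical stand-in for an imported data frame (10 rows instead of 4399).
demo_df <- data.frame(value = 1:10)

drop_last <- 7  # would be 2928 for the real data (rows 1472 to 4399)

# A negative n tells head() how many rows to drop from the end.
kept <- head(demo_df, -drop_last)

# Equivalent positive form: keep the first nrow - drop_last rows.
kept_same <- head(demo_df, nrow(demo_df) - drop_last)
```

The negative form only gives the intended result when every file has exactly 4399 rows; the positive form with a fixed count (1471) does not depend on the file length.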
Cricketer
0

Your myfiles_leapyears is a list. When subsetting a list, you need double brackets to access a single element, otherwise you just get a sublist of length 1.

So replace

myfiles_leapyears[i]

with

myfiles_leapyears[[i]]

That will at least take care of the invalid subscript type 'list' error. I agree with Josh W. that the nrows argument seems smarter than the skip argument.
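
To illustrate the difference between single and double brackets, here is a minimal sketch with a made-up list of two small data frames (`demo_list` and `combined` are hypothetical names standing in for the real objects):

```r
# A small list of data frames, standing in for myfiles_leapyears.
demo_list <- list(data.frame(value = 1:3), data.frame(value = 4:6))

class(demo_list[1])    # "list": a sublist of length 1
class(demo_list[[1]])  # "data.frame": the element itself

# Looping with [[i]] hands each data frame to rbind() directly.
combined <- NULL
for (i in seq_along(demo_list)) {
  combined <- rbind(combined, demo_list[[i]])
}
```

Using `seq_along()` instead of `1:length()` also keeps the loop from running when the list is empty.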

Alternatively, if you define myfiles_leapyears using sapply ("s" for simplify) instead of lapply ("l" for list), you'll probably be fine using [i]:

myfiles_leapyears = sapply(leapyears, read.delim,colClasses=c('NULL','numeric'),sep =",")
Gregor Thomas