
I need to read a CSV file in R. My file has 100 lines and I want to read it 10 lines at a time. For example:

  1. read the first 10 lines
  2. read 10 lines beginning at line 11, because I have already read 10
  3. read 10 lines beginning at line 21, and so on

I tried to use for (i in 1:10) or while, but I can't get it to read the file starting from line 11, then from line 21, and so on.

Does anyone know how I can do this?

Thanks!!

Andressa

2 Answers

Probably answered many times before (e.g., by me), but here's some data

fl = tempfile()
dim(mtcars)
write.csv(mtcars, file=fl)

Use a connection to open the file, then read in 10 rows

fin = file(fl, open="r")
nrows <- 10
data <- read.csv(fin, nrows=nrows)      # first chunk
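
The connection matters because an open connection keeps its position between reads, whereas passing a file name reopens the file from the top on every call. A minimal sketch of that difference, using readLines on a throwaway file (the file contents are made up for illustration):

```r
## Throwaway file with five made-up lines
tmp <- tempfile()
writeLines(paste("line", 1:5), tmp)

## An open connection remembers where the previous read stopped
con <- file(tmp, open = "r")
a <- readLines(con, n = 2)
b <- readLines(con, n = 2)
close(con)
print(a)    # [1] "line 1" "line 2"
print(b)    # [1] "line 3" "line 4"

## A file name is reopened each time, so both reads start at the top
again1 <- readLines(tmp, n = 2)
again2 <- readLines(tmp, n = 2)
print(identical(again1, again2))    # [1] TRUE
```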

Remember the column names and classes

col.names <- names(data)                # remember column names and...
colClasses <- sapply(data, class)       # ... column classes

then process the chunk and read in the next chunk of data, making sure to add the header and column classes. Stop reading when there's no more data.

repeat {
    ## process data...
    cat("Read", nrow(data), "rows\n")
    ## ...then read the next chunk
    data <- read.csv(fin, header=FALSE, colClasses=colClasses,
                     col.names=col.names, nrows=nrows)
    if (nrow(data) == 0)                # done yet?
        break
}
close(fin)                              # release the connection when done

mtcars has 32 rows, and we see

Read 10 rows
Read 10 rows
Read 10 rows
Read 2 rows
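
If this pattern is needed more than once, the steps above could be wrapped in a function. The sketch below is one possible arrangement, not part of the original answer: read_chunks and its arguments path, chunk_size and fun are hypothetical names, and it assumes the first chunk is representative enough to fix the column classes.

```r
## Hypothetical helper: apply `fun` to each chunk of a CSV file and
## collect the results in a list.
read_chunks <- function(path, chunk_size, fun) {
    con <- file(path, open = "r")
    on.exit(close(con))                         # close even if `fun` fails
    chunk <- read.csv(con, nrows = chunk_size)  # first read consumes the header
    col.names <- names(chunk)
    colClasses <- sapply(chunk, class)
    results <- list()
    while (nrow(chunk) > 0) {
        results <- c(results, list(fun(chunk)))
        ## later reads have no header line, so supply names and classes
        chunk <- read.csv(con, header = FALSE, colClasses = colClasses,
                          col.names = col.names, nrows = chunk_size)
    }
    results
}

## Usage on the same mtcars example: count the rows in each chunk
fl2 <- tempfile()
write.csv(mtcars, file = fl2)
chunk_rows <- unlist(read_chunks(fl2, 10, nrow))
print(chunk_rows)    # [1] 10 10 10  2
```

The on.exit call means the connection is closed however the function exits, which a bare loop does not guarantee.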

We can verify that each chunk has the correct header and that the columns have consistent classes. Factors can cause problems, because the levels may differ from chunk to chunk, especially when the chunks are small; the argument stringsAsFactors=FALSE avoids this (it has been the default since R 4.0.0).
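
To illustrate the factor issue, here is a small sketch with made-up data in which the grp column takes different values in each chunk. The file contents and the names first and second are assumptions for the example only.

```r
## Made-up data: 'grp' has different values in each chunk of two rows
tmp <- tempfile()
writeLines(c("id,grp", "1,a", "2,a", "3,b", "4,c"), tmp)

con <- file(tmp, open = "r")
first  <- read.csv(con, nrows = 2, stringsAsFactors = TRUE)
second <- read.csv(con, header = FALSE, col.names = names(first),
                   nrows = 2, stringsAsFactors = TRUE)
close(con)

print(levels(first$grp))     # [1] "a"
print(levels(second$grp))    # [1] "b" "c"
## The integer codes are only meaningful within one chunk:
print(as.integer(second$grp))    # [1] 1 2
```

Reading with stringsAsFactors=FALSE keeps grp as character in every chunk, so the values stay comparable across chunks and can be converted to a factor once all the data has been seen.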

Martin Morgan
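
hdr <- names(read.csv("yourfile.csv", nrows = 1))   # read the column names once
for (i in seq(1, 100, by=10)) {
  cat(i, "\n")
  ## skip the header line plus the i-1 data rows already read
  dat <- read.csv("yourfile.csv", header = FALSE, col.names = hdr,
                  skip = i, nrows = 10)
  print(dat)
}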
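
One thing to watch with skip: unless header=FALSE is given, read.csv treats the first line it reads after skipping as the header, so a data row can end up as the column names. The sketch below shows this on the mtcars file from the other answer; the names bad, good and hdr are illustrative only.

```r
## Same kind of test file as in the other answer
tmp <- tempfile()
write.csv(mtcars, file = tmp)

## Default header = TRUE: after skipping 10 lines, the 10th data row
## is consumed as the header
bad <- read.csv(tmp, skip = 10, nrows = 10)

## Read the real header once, then skip it together with the first 10 rows
hdr <- names(read.csv(tmp, nrows = 1))
good <- read.csv(tmp, header = FALSE, col.names = hdr, skip = 11, nrows = 10)

print(names(bad))     # built from a data row, not the real column names
print(names(good))    # the real column names
```

Both calls return data rows 11 to 20, but only the second keeps the real column names. Note also that each call with a file name rescans the file from the top, so for large files the connection-based approach does less work.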
Ven Yao