
I would like to read a large .csv into R. It'd be handy to split it into several objects and treat them separately. I managed to do this with a while loop, assigning each tenth of the file to its own object:

# The dataset is larger, numbers are fictitious

n <- 0

while(n < 10000){
  a <- paste('a_', n, sep = '')
  assign(a, read.csv('df.csv', 
                      header = FALSE, stringsAsFactors = FALSE, nrows = 1000, skip = n))
  # There will be some additional processing here (omitted) 
  n <- n + 1000
}

Is there a more R-like way of doing this? I immediately thought of lapply. As I understand it, each object would be an element of a list that I would then have to unlist. I gave the following a shot, but it didn't work: my list only has one element.

A <- lapply('df.csv', read.csv, 
             header = F, stringsAsFactors = F, nrows = 1000, skip = seq(0, 10000, 1000))

What am I missing? How do I proceed from here? How do I then unlist A and specify each element of the list as a separate data.frame?

Jasper
    `library("fortunes"); fortune(236)` https://stackoverflow.com/questions/17559390/why-is-using-assign-bad – jogo May 30 '17 at 14:06

1 Answer


If you apply lapply to a single element you'll have only one element as an output.
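To illustrate that point, here is a minimal sketch; the toy functions and the offsets are only placeholders, not part of your actual processing:

```r
# lapply() returns one list element per element of its first argument.
one  <- lapply('df.csv', nchar)                                # input of length 1
many <- lapply(seq(0, 9000, by = 1000), function(n) n + 1000)  # input of length 10

length(one)   # 1
length(many)  # 10
```

So the vector to iterate over should be the one that changes between reads (the skip offsets), not the file name.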

You probably want to do this:

skips <- seq(0, 9000, by = 1000) # number of rows to skip before each chunk

A <- lapply(skips, function(n) {
  read.csv('df.csv', header = FALSE, stringsAsFactors = FALSE, nrows = 1000, skip = n)
})
names(A) <- paste0('a_', skips) # same names as in the question's loop

For each element of skips, called n because that's the name I chose for the function parameter, lapply runs your read.csv call. A will be a list of the results.

Edit: As @Val mentions in the comments, assign doesn't seem to be needed here, so I removed it. If all works fine, you'll end up with a list of data.frames, one per chunk of your csv.
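On the "how do I unlist A" part: you probably don't need to, since each element of the list is already a data.frame. Below is a rough, self-contained sketch of how that could look. The temporary file and its two columns are made up so the example runs on its own; swap in your real 'df.csv' and processing.

```r
# Hypothetical stand-in for 'df.csv': 10,000 rows, two columns, no header.
path <- tempfile(fileext = '.csv')
write.table(data.frame(id = 1:10000, value = (1:10000) * 2),
            path, sep = ',', row.names = FALSE, col.names = FALSE)

# Read the file in ten chunks of 1000 rows each.
skips <- seq(0, 9000, by = 1000)
A <- lapply(skips, function(n) {
  read.csv(path, header = FALSE, stringsAsFactors = FALSE, nrows = 1000, skip = n)
})
names(A) <- paste0('a_', skips)

# Access a chunk by name (or by position); it is already a data.frame.
first_chunk <- A[['a_0']]
nrow(first_chunk)        # 1000
A[['a_9000']]$V1[1]      # 9001

# Per-chunk processing can stay inside the list.
row_counts <- sapply(A, nrow)

# If a single combined data.frame is wanted at the end:
combined <- do.call(rbind, A)
```

Keeping the chunks in a named list rather than in separate global objects means any later step can be another lapply or sapply over A.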

moodymudskipper