
I have a 2.5 GB file and 4 GB of RAM. I do not know the number of records in the file. I want to split it into 5 parts (chunks) of about 500 MB each (approximately 2,000,000 records per chunk).

My problem is how to tell R that it has reached the last line, so that it stops the process and writes out the last file at whatever size it happens to be.

The code I'm using is below:

f <- file(file.choose(), "r")   # the file chosen here is the 2.5 GB file

# Read a fixed number of rows per pass and write each batch to d_<i>.csv
for (i in 1:4) {
  ab <- sprintf('d_%d', i)
  write.csv(read.table(f, nrows = 1600000, blank.lines.skip = TRUE),
            file = sprintf('%s.csv', ab))
}

The error is:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 1446670 did not have 10 elements

So how do I tell my program to stop when it reaches the last record and write out the last chunk at whatever size it has?

Utsav Bhargava
  • Use the right tool for the job: split it outside R (or within R via `system()`), then read it in. See: [Unix: How to split a file into equal parts](http://stackoverflow.com/questions/7764755/unix-how-to-split-a-file-into-equal-parts-without-breaking-individual-lines) – zx8754 Jul 28 '15 at 08:52
  • By the way, the error indicates that something is wrong with the delimiters: `line 1446670 did not have 10 elements` – zx8754 Jul 28 '15 at 08:54
  • That record has some missing data (6 elements missing), but I do not want the process to end there. It should continue and end only when the last record is reached – Utsav Bhargava Jul 28 '15 at 10:51
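
A minimal sketch of one way to do this inside R, assuming the file is whitespace-delimited as the defaults of `read.table` expect: read a fixed number of lines per pass with `readLines()`, stop when it returns nothing, and parse each batch with `fill = TRUE` so that short records are padded with `NA` instead of raising the "did not have 10 elements" error. The function name `split_file`, its arguments, and the `d_%d.csv` naming are illustrative choices, not part of the original code.

```r
# Split a large delimited text file into CSV chunks of at most
# `chunk_lines` lines each. `split_file`, `chunk_lines` and `prefix`
# are hypothetical names chosen for this sketch.
split_file <- function(path, chunk_lines = 2000000L, prefix = "d_") {
  con <- file(path, open = "r")
  on.exit(close(con))            # always release the connection
  i <- 0L
  repeat {
    # readLines() continues from where the previous call stopped
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break   # end of file: nothing left to read
    i <- i + 1L
    # fill = TRUE pads records with missing fields with NA
    chunk <- read.table(text = lines, fill = TRUE,
                        blank.lines.skip = TRUE,
                        stringsAsFactors = FALSE)
    write.csv(chunk, file = sprintf("%s%d.csv", prefix, i),
              row.names = FALSE)
  }
  invisible(i)                   # number of chunk files written
}
```

Called as `split_file(file.choose())`, this writes `d_1.csv`, `d_2.csv`, and so on into the working directory, and the last file simply holds whatever lines remain. One caveat: with `fill = TRUE`, `read.table` infers the number of columns from the first five lines of each batch, so a batch that begins with short records may need an explicit `col.names` argument.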
