How to open a file and then write lines to it in a loop

Question

There are quite a few R posts with a similar topic but they don't provide what I'm looking for.

Psuedo code (this is NOT meant to be R) for what I want is as as follows:

fileConn <- file("foo.txt")
for (i in 1:hiLimit) {
  # extract elements from each nested and variable json element in an R list.
  # paste elements into a comma separated list
  write(pastedStuff,fileConn)
}
close(fileConn)

Now if I skip the 'file' and 'close' and just use 'cat' with a filename and 'append=TRUE' as follows:

cat(paste(cve,vndr,pnm,vnmbr,vaffct,sep=","),file="outfile.txt",append=TRUE,sep="\n")

I get what I want. But, presumably, this is opening and closing the file for each call (??? assumption). Avoiding that should make it faster.

What I have not been able to work out is how to achieve the result via the method in the psuedo code which only opens and closes the file once. Using 'cat' or 'writeLines' just gives me the last line in the file.

By way of explanation, the problem I'm working on involves building a dataframe from scratch row by row. My timings (see below) indicate that by far the fastest way I can do this is to write a csv to disk and then read it back in to create the dataframe. This is crazy but that's the way it's panning out.

## Just the loop without any attempt to collect parsed data into a dataframe
system.time(tmp <- affectsDetails(CVEbase,Affect))
   user  system elapsed 
   0.30    0.00    0.29 

## Using rbind as in rslt<- rbind (rslt,c(stuff)) to build dataframe in the loop.
system.time(tmp <- affectsDetails(CVEbase,Affect))
   user  system elapsed 
 990.46    2.94  994.01 

# Preallocate and insert list as per 
# https://stackoverflow.com/questions/3642535/creating-an-r-dataframe-row-by-row
system.time(tmp <- affectsDetails(CVEbase,Affect))
   user  system elapsed 
1451.42    0.04 1452.37 

# Write to a file with cat and read back the csv.
system.time(tmp <- affectsDetails(CVEbase,Affect))
   user  system elapsed 
  10.70   29.00   45.42

Any suggestions appreciated!

score 1 · Accepted Answer · answered Jul 09 '19 at 05:39

Not sure how I can help you. But you can open a connection and keep it open until writing is finished.

testcon <- file(description = "C:/test.txt", open = "a")

isOpen(testcon)
[1] TRUE

start <- Sys.time()
for (i in 1:10000) {

 cat(paste0("hallo", i), file= testcon, append=TRUE,sep="\n")

}
end <- Sys.time()

end-start
Time difference of 0.2017999 secs

close(testcon)

Which seems to be considerably faster than:

start <- Sys.time()
for (i in 1:10000) {

  cat(paste0("hallo", i), file= "C:/test.txt", append=TRUE,sep="\n")

}
end <- Sys.time()

end-start
Time difference of 3.382569 secs

That seems to do it! Thanks so much. Same speedup observed, factor of 10 better. I suspect it is the ' open="a" ' that makes the difference. Figuring out why the in-memory performance is so bad will be a problem for another day. — A.Commons, Jul 09 '19 at 05:55

How to open a file and then write lines to it in a loop

1 Answers1