Two qualifications: (1) this is years after the question was asked; (2) this only works for replacing the last line. Despite point 2, I think it could be adapted to replace a specific line other than the last one.
Rather than using read.table and write.table, which take time with large arrays, readLines and writeLines appear to be more efficient. In the following example I remove the last line of a large array and replace it with new text.
Set up the example by creating a large array and saving it to a file:
write.table(
array(runif(1000000),dim=c(1000,1000)),
file="BigArray.r", row.names = FALSE, col.names = FALSE, sep = "\t")
Read the big array file with readLines, drop the last line, and write the remaining lines back with writeLines. Then use write with append=TRUE to add the new final line:
time=proc.time()
BigArray=readLines("BigArray.r")
BigArray=BigArray[-length(BigArray)]
writeLines(BigArray,"BigArray.r",sep="\n")
write(seq(1,1000,1),ncolumns=1000,file="BigArray.r",append=TRUE,sep="\t")
proc.time()-time
user system elapsed
0.69 0.10 0.85
On my machine this is several times faster than the read.table/write.table alternative:
time=proc.time()
BigArray=read.table("BigArray.r", sep = "\t")
BigArray[1000,]=seq(1,1000,1)
write.table(BigArray,file="BigArray.r", row.names = FALSE, col.names = FALSE,
sep = "\t")
proc.time()-time
user system elapsed
3.62 0.11 3.75
Somebody may be able to do a better job of replacing a specific line in the middle of the array; I couldn't get the new line into the same text format that readLines produces.
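For what it's worth, one approach that might work for a middle line: readLines returns each row as a single tab-separated string, so the replacement can be built in the same format with paste(..., collapse = "\t") and assigned by index. This is only a minimal sketch, assuming a tab-delimited file with no header; replace_line is a hypothetical helper name (not a base R function), and the small temporary file is just for illustration.

```r
# Hypothetical helper (not part of base R): replace line `n` of a
# delimited text file with the values in `values`.
replace_line <- function(path, n, values, sep = "\t") {
  lines <- readLines(path)
  stopifnot(n >= 1, n <= length(lines))
  # Collapse the vector into one string so it matches what readLines returns
  lines[n] <- paste(values, collapse = sep)
  writeLines(lines, path)
  invisible(lines)
}

# Usage on a small temporary file (assumed layout: 4 rows, 3 columns)
tmp <- tempfile(fileext = ".txt")
write.table(matrix(1:12, nrow = 4), file = tmp,
            row.names = FALSE, col.names = FALSE, sep = "\t")
replace_line(tmp, 2, c(0, 0, 0))
result <- readLines(tmp)
```

Like the last-line version, this still reads and rewrites the whole file, so I would expect similar timings to the readLines example, but I have not measured it on the large array.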