I have seen plenty of questions about writing to a file, but I am wondering: what is the most robust way to open a text file, append some data, and close it again when many connections may be writing to it (i.e. in a parallel-computing situation) and you cannot guarantee when each connection will want to write?
For instance, in the following toy example, which uses just the cores on my desktop, everything seems to work, but I wonder whether this method will become prone to failure as the writes get longer and the number of processes writing to the file increases (especially across a network share, where there may be some latency).
Can anyone suggest a robust, definitive way that connections should be opened, written to, and closed when other worker processes may want to write to the file at the same time?
require(doParallel)
require(doRNG)

ncores <- 7
cl <- makeCluster(ncores, outfile = "")
registerDoParallel(cl)

res <- foreach(j = 1:100, .verbose = TRUE, .inorder = FALSE) %dorng% {
  d <- matrix(rnorm(1e3, j), nrow = 1)
  conn <- file("~/output.txt", open = "a")  # open in append mode
  write.table(d, conn, col.names = FALSE)   # append = TRUE is ignored (with a warning) for connections
  close(conn)
}

stopCluster(cl)
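For comparison, here is one variant I have considered (not necessarily the robust answer I am after): serialising the writes with an advisory file lock via the filelock package, so that only one worker touches the file at a time. The paths, core count, and smaller row size here are just illustrative choices, and I use plain %dopar% for brevity:

```r
library(doParallel)
library(filelock)  # provides advisory, cross-process file locks

ncores <- 2
cl <- makeCluster(ncores)
registerDoParallel(cl)

out_path  <- tempfile(fileext = ".txt")  # illustrative path; in practice a shared file
lock_path <- paste0(out_path, ".lock")   # separate lock file guarding the output

res <- foreach(j = 1:100, .inorder = FALSE) %dopar% {
  d <- matrix(rnorm(10, j), nrow = 1)
  lk <- filelock::lock(lock_path)        # blocks until the exclusive lock is acquired
  conn <- file(out_path, open = "a")
  write.table(d, conn, col.names = FALSE, row.names = FALSE)
  close(conn)
  filelock::unlock(lk)                   # release so other workers can write
  NULL
}

stopCluster(cl)
```

Because each append happens entirely inside the lock, lines should never interleave, but I do not know how well this behaves over a network share.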
I am looking for the best way to do this, if there even is a best way. Perhaps R and foreach take care of what I would call write-lock issues automagically?
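The one workaround I can see that sidesteps the locking question entirely is to have each task return its rows and let the master process do the single write, along these lines (again with illustrative sizes and paths, and plain %dopar%):

```r
library(doParallel)

ncores <- 2
cl <- makeCluster(ncores)
registerDoParallel(cl)

# Each task returns its row; .combine = rbind stacks them on the master,
# so only one process ever opens the output file.
res <- foreach(j = 1:100, .combine = rbind, .inorder = FALSE) %dopar% {
  matrix(rnorm(10, j), nrow = 1)
}

stopCluster(cl)

out_path <- tempfile(fileext = ".txt")   # illustrative path
write.table(res, out_path, col.names = FALSE, row.names = FALSE)
```

But this holds all results in memory until the loop finishes, which is exactly what I am trying to avoid by appending as I go.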
Thanks.