I would like to remove all lines from a file that start with a certain pattern, and I would like to do this in R. Since the file can be huge, it is good practice not to read the whole file first, remove all matching lines, and then write the whole file back. I am thus wondering whether I can have both a read and a write connection to the same file (both open all the time, or one at a time?). The following shows the idea (but it 'hangs' and thus fails).
## Create an example file
fnm <- "foo.txt" # file name
sink(fnm)
cat("Hello\n## ----\nworld\n")
sink()
## Read the file 'fnm' one line at a time and write it back to 'fnm'
## if it does *not* contain the pattern 'pat'
pat <- "## ----" # pattern
while(TRUE) {
    rcon <- file(fnm, "r") # read connection
    line <- readLines(rcon, n = 1) # read one line
    close(rcon)
    if(length(line) == 0) { # end of file
        break
    } else {
        if(!grepl(pat, line)) {
            wcon <- file(fnm, "w")
            writeLines(line, con = wcon)
            close(wcon)
        }
    }
}
Note:
1) See here for an answer if one writes to a new file. One could then delete the old file and rename the new one to the old one, but that does not seem very elegant :-).
2) Update: The following MWE, which uses a single read-and-write ("r+") connection instead of two separate ones, produces
Hello
world
-
world
See:
## Create an example file
fnm <- "foo.txt" # file name
sink(fnm)
cat("Hello\n## ----\nworld\n")
sink()
## Read the file 'fnm' one line at a time and write it back to 'fnm'
## if it does *not* contain the pattern 'pat'
pat <- "## ----" # pattern
con <- file(fnm, "r+") # read and write connection
while(TRUE) {
    line <- readLines(con, n = 1L) # read one line
    if(length(line) == 0) break # end of file
    if(!grepl(pat, line))
        writeLines(line, con = con)
}
close(con)
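
For comparison, the fallback mentioned in note 1 (write to a new file, then rename it over the old one) can be sketched as follows. This is only a minimal sketch under some assumptions: the helper name `remove_lines` is hypothetical and not from any package, the temporary file is created in the same directory as the original so that the rename stays on one file system, and `startsWith()` (base R >= 3.3.0) is used so that only lines *starting* with the pattern are dropped, whereas the `grepl()` calls above match the pattern anywhere in the line.

```r
## Hypothetical helper (name is an assumption, not an existing function):
## stream 'fnm' one line at a time into a temporary file, skipping every
## line that starts with 'pat', then rename the temporary file over 'fnm'.
remove_lines <- function(fnm, pat) {
    tmp <- tempfile(tmpdir = dirname(fnm)) # temporary file next to 'fnm'
    rcon <- file(fnm, "r") # read connection to the original file
    wcon <- file(tmp, "w") # write connection to the temporary file
    repeat {
        line <- readLines(rcon, n = 1L) # read one line
        if(length(line) == 0L) break # end of file
        if(!startsWith(line, pat)) # keep lines not starting with 'pat'
            writeLines(line, con = wcon)
    }
    close(rcon)
    close(wcon)
    file.rename(tmp, fnm) # replace the original file
}

## Usage on a small example file
fnm <- tempfile(fileext = ".txt")
writeLines(c("Hello", "## ----", "world"), fnm)
remove_lines(fnm, "## ----")
readLines(fnm) # "Hello" "world"
```

Only one line is held in memory at a time, at the cost of temporarily needing disk space for a second copy of the file.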