
I have a folder with many .txt files and want to import them into one dataframe. I noticed that after 700-800 files, the loop becomes very slow. Is there a way to optimize the loop, or a function that gives the same output faster?

DF <- data.frame()

for (i in 1:20000) {
  ## Read the i-th file; return NULL if it is missing or unreadable.
  newDF <- tryCatch(read.table(paste0("PATH", i, ".txt"), header = FALSE, sep = ","),
                    error = function(e) NULL)
  ## Append the new rows; this copies all of DF on every iteration.
  DF <- rbind(DF, newDF)
}
Priit Mets
  • If you have a moment, read the [R Inferno](https://www.burns-stat.com/pages/Tutor/R_inferno.pdf) chapter 2, *Growing Objects*. Regardless, iteratively adding rows to a frame is going to be slow: each time you add even one row, all of the existing rows are copied, existing in memory twice. Doing this a few times is fine, but if you do it over and over, each time the frame's size increases, the copy takes just a little bit longer. – r2evans Apr 27 '21 at 12:20
  • There are several techniques for doing it differently (see https://stackoverflow.com/a/24376207/3358227), but the bottom line is to make a list of frames (e.g., `LOF`) and then do the `rbind` all at once, choosing one of `do.call(rbind, LOF)`, `dplyr::bind_rows(LOF)`, or `data.table::rbindlist(LOF)`. One way to make this is `LOF <- lapply(1:20000, function(i) read.table(...))` (using `tryCatch` internally as necessary). – r2evans Apr 27 '21 at 12:22
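
A minimal sketch of the list-then-bind approach described in the comments, assuming the files really are named PATH1.txt through PATH20000.txt as in the question (the `LOF` name is taken from the comment; `dplyr` / `data.table` are only needed for the alternative bind calls):

files <- paste0("PATH", 1:20000, ".txt")

## Read each file into a list element; missing or unreadable files become NULL,
## mirroring the tryCatch in the original loop.
LOF <- lapply(files, function(f) {
  tryCatch(read.table(f, header = FALSE, sep = ","), error = function(e) NULL)
})

## Bind everything in a single step; NULL elements are ignored.
DF <- do.call(rbind, LOF)
## or: DF <- dplyr::bind_rows(LOF)
## or: DF <- data.table::rbindlist(LOF)

Because the frames are combined once at the end rather than grown inside the loop, the repeated copying that makes the original loop slow down after a few hundred files is avoided.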

0 Answers