I have a working loop that reads text files (with unknown names) from several folders (in known locations), updates the columns of each file, and saves it back under the same name in the same folder:
folders <- c(1, 2, 3)

for (i in seq_along(folders)) {
  # subset the lookup data for this folder and drop the id column
  dt <- df[df$id == folders[i], ]
  dt$id <- NULL

  # list files by full path instead of calling setwd(): a relative
  # setwd("data/2") breaks after we have already moved into "data/1"
  loc <- paste0("data/", folders[i])
  file.names <- list.files(loc, pattern = "\\.txt$", full.names = TRUE)

  for (j in seq_along(file.names)) {
    text <- read.csv(file.names[j], header = FALSE, stringsAsFactors = FALSE)
    text2 <- merge(text, dt, by = "matched", all.x = TRUE)
    write.table(text2, file.names[j], sep = ",", na = "",
                row.names = FALSE, quote = TRUE, col.names = FALSE)
    rm(text, text2)
    print(j)
  }
}
There are two problems I'm facing: first, it is very slow; second, it uses too much RAM/memory. I tried to improve it myself, but I don't know much about R. I've read that it is possible to increase speed by writing functions, and that "if we simply initialize a vector (with NAs, zeros or any other value) with the total length and then run the loop, we can drastically increase the speed of our algorithm". I wish I could do something like that myself.
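For reference, here is a sketch of the kind of rewrite I'm imagining, using the `data.table` package (its `fread()`/`fwrite()` are typically much faster and lighter on memory than `read.csv()`/`write.table()`). This is only a sketch under my assumptions: it assumes each text file's first column holds the `matched` key that the merge above joins on.

```r
library(data.table)

folders <- c(1, 2, 3)
dt_all <- as.data.table(df)

for (folder in folders) {
  # subset the lookup table for this folder and drop the id column
  lookup <- dt_all[id == folder][, id := NULL]
  setkey(lookup, matched)

  files <- list.files(file.path("data", folder),
                      pattern = "\\.txt$", full.names = TRUE)

  for (f in files) {
    text <- fread(f, header = FALSE)
    setnames(text, 1, "matched")         # assumption: first column is the key
    out <- lookup[text, on = "matched"]  # keyed left join, like merge(..., all.x = TRUE)
    fwrite(out, f, na = "", col.names = FALSE)
  }
}
```

Would something along these lines be the right direction, or is there a better way to cut the memory use?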