
I'm trying to read CSV files into data frames in R. I've already managed to cycle through defined folders, read the CSV files, and assign each one to a data frame named after the file. However, I can't seem to append data if a data frame already exists. If one already exists, I want to append the new data at the bottom instead of replacing the whole thing.

This is what I have working so far:

# Full paths for reading, plus extension-free names for the data frames
testPath <- list.files("\\path\\subfolder", pattern = "\\.csv$", full.names = TRUE)
fileName <- sub("\\.csv$", "", basename(testPath))

for (i in seq_along(testPath)) {
  tempVar <- read.csv(testPath[i])
  assign(fileName[i], tempVar)
}

This only cycles through one folder; I know how to make it cycle through multiple folders. However, when I run this code twice, it does not append data to the data frames; it just creates them again from the CSV.

Thanks for the help!

UPDATE: I figured it out, see below.

robs
  • Lots of options at this very well travelled duplicate (though you should avoid any solution using `assign`): https://stackoverflow.com/q/11433432/324364 – joran Aug 03 '18 at 19:13
  • Hi, this post is the one I used as the base for my code above. However, it doesn't append any data to existing data frames when looping through different folders. I also tried the update approach using `list2env(lapply())`; if I run the code multiple times, I can't get the data to append – robs Aug 03 '18 at 19:23
  • The solutions there aren't organized in the clearest fashion, but many of them use `do.call(rbind,...)` or `dplyr::bind_rows` or the equivalent from data.table to do what you want. It's all there, you just have to read carefully and slowly, since, as I said, it's not organized as cleanly as it could. – joran Aug 03 '18 at 19:26
  • There, I edited the first answer there, which is probably the only one you really ventured into. Hopefully it points you in a more helpful direction now. – joran Aug 03 '18 at 19:34
  • @robs: this might help too https://stackoverflow.com/a/48105838/786542 – Tung Aug 03 '18 at 20:40
  • Just figured it out, posting the answer – robs Aug 03 '18 at 20:50
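
For anyone reading along, the `do.call(rbind, ...)` pattern mentioned in the comments can be sketched roughly as follows. This is a minimal illustration, not code taken from the linked answers: the temporary folder and the two small CSV files are made up so that the snippet runs on its own.

```r
# Hypothetical setup: write two small CSV files into a temporary folder
demo_dir <- file.path(tempdir(), "rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# List the files, read each one into a list, then bind all rows in one step
csv_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
pieces <- lapply(csv_paths, read.csv)
combined <- do.call(rbind, pieces)

print(combined)
```

This assumes every file has the same columns; `rbind` on data frames stops with an error if the column names differ.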

1 Answer


I just figured it out. I'm pretty new to R, so I did a workaround: if a data frame already exists, I create a temporary data frame and then combine it with the old one using `rbind`. In case anyone is looking for the code in the future, here's what I did.

It's not pretty, but it works:

mainFilesOutputFolder <- "\\\\network\\folder\\"
scanSubFolders <- c("subfolder1", "subfolder2", "subfolder3")

for (k in seq_along(scanSubFolders)) {
  # Full paths of the CSV files in this subfolder
  csvOutputs <- list.files(paste0(mainFilesOutputFolder, scanSubFolders[k], "\\"), "\\.csv$", full.names = TRUE)

  # File names without the extension, used as data frame names
  fileName <- sub("\\.csv$", "", basename(csvOutputs))

  for (i in seq_along(csvOutputs)) {
    df_name <- fileName[i]
    if (exists(df_name)) {
      # A data frame with this name already exists: append the new rows at the bottom
      mergedDF <- rbind(get(df_name), read.csv(csvOutputs[i]))
      assign(df_name, mergedDF)
    } else {
      # First time this name is seen: create the data frame
      assign(df_name, read.csv(csvOutputs[i]))
    }
  }
}

Hope this helps someone!

robs
  • Note that you're growing object `mergedDF` inside a loop. This is very memory inefficient and not recommended in R. See these great posts: [Efficient accumulation in R](http://winvector.github.io/Accumulation/) & [Applying a function over rows of a data frame](https://rpubs.com/wch/200398) – Tung Aug 04 '18 at 11:00
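
One way to act on this comment, sketched under assumptions: collect the paths for each file name, read them into a list, and call `rbind` once per name, which also avoids `assign`/`get`. The folder layout below (two subfolders that each hold a `sales.csv`, one of which also holds a `costs.csv`) is invented for the demo, and the result is a named list of data frames rather than separate variables.

```r
# Hypothetical layout: <tempdir>/append_demo/subfolder1 and subfolder2
root <- file.path(tempdir(), "append_demo")
for (sub_dir in c("subfolder1", "subfolder2")) {
  dir.create(file.path(root, sub_dir), recursive = TRUE, showWarnings = FALSE)
}
write.csv(data.frame(x = 1:2), file.path(root, "subfolder1", "sales.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(root, "subfolder2", "sales.csv"), row.names = FALSE)
write.csv(data.frame(x = 9L),  file.path(root, "subfolder1", "costs.csv"), row.names = FALSE)

# Find every CSV under the root, then group the paths by file name
paths <- list.files(root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
groups <- split(paths, sub("\\.csv$", "", basename(paths)))

# Read each group and bind its rows once, instead of growing a data frame per file
frames <- lapply(groups, function(p) do.call(rbind, lapply(p, read.csv)))

nrow(frames$sales)  # rows collected from both subfolders
```

Each element is reached by name, for example `frames$sales` or `frames[["costs"]]`. Re-running the snippet rebuilds the list from the files on disk, so rows are not appended twice.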