6

I have many .RData files, each containing a single data frame that I saved in a previous analysis, and the data frame has the same name in every file. For example, load("file1.RData") gives me a data frame called 'df', and load("file2.RData") gives me another data frame that is also called 'df'. Is it possible to combine all these .RData files into one big .RData file, since I need to load them all at once, with each data frame named after its file so that I can then use the different data frames?

I can do this using the code below, but it is very intricate; there must be a simpler way. Thank you for your suggestions.

Say I have 3 .RData files and want to save them all in a file called "main.RData" under their own names (at the moment they all come out as 'df'):

all.files = c("/Users/fra/file1.RData", "/Users/fra/file2.RData", "/Users/fra/file3.RData")
assign(gsub("/Users/fra/", "", all.files[1]), local(get(load(all.files[1]))))
rm(list= ls()[!(ls() %in% (ls(pattern = "file")))])
save.image(file="main.RData")


all.files = c("/Users/fra/file1.RData", "/Users/fra/file2.RData", "/Users/fra/file3.RData")

for (f in all.files[-1]) {
  assign(gsub("/Users/fra/", "", f), local(get(load(f))))
  rm(list= ls()[!(ls() %in% (ls(pattern = "file")))])
  save.image(file="main.RData")
}
user971102

2 Answers

5

Here's an option that combines several existing posts.

all.files = c("file1.RData", "file2.RData", "file3.RData")

Read multiple data frames into a single named list (see "How can I load an object into a variable name that I specify from an R data file?"):

mylist <- lapply(all.files, function(x) {
  load(file = x)
  # ls() now lists the loaded object plus the argument 'x'; drop 'x'
  get(ls()[ls() != "x"])
})

names(mylist) <- all.files #Note, the names here don't have to match the filenames
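Since the question asks for each data frame to be named after its file, the names could also be derived from the paths themselves. The following is only a minimal sketch, assuming each file holds exactly one object; the demo files written to tempdir() and the helper name `load_one` are hypothetical and exist purely for illustration.

```r
# Hypothetical demo files: each holds a single data frame called 'df'
paths <- file.path(tempdir(), c("file1.RData", "file2.RData", "file3.RData"))
for (i in seq_along(paths)) {
  df <- data.frame(x = i, y = letters[i])
  save(df, file = paths[i])
}
rm(df)

# Load one file into a private environment so nothing gets overwritten
load_one <- function(path) {
  env <- new.env()
  obj_name <- load(path, envir = env)  # load() returns the loaded object names
  stopifnot(length(obj_name) == 1L)    # assumption: one object per file
  get(obj_name, envir = env)
}

mylist <- lapply(paths, load_one)

# Name each element after its file, dropping the directory and the extension
names(mylist) <- tools::file_path_sans_ext(basename(paths))
names(mylist)
```

Dropping the extension gives syntactic names such as file1, so the data frames can later be referred to without backticks.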

You can save the list itself, or copy the data frames into the global environment before saving (see "Unlist a list of dataframes"):

list2env(mylist, .GlobalEnv)
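To end up with the single .RData file the question asks for, the named data frames then need to be written out with save(). A minimal sketch, assuming a small stand-in for the named list and a hypothetical output path in tempdir(); it uses a dedicated environment instead of the global one so that unrelated workspace objects are not saved by accident.

```r
# Hypothetical stand-in for the named list of data frames
mylist <- list(file1 = data.frame(x = 1:2), file2 = data.frame(x = 3:4))

# Copy the list elements into a fresh environment
env <- list2env(mylist, envir = new.env())

main_file <- file.path(tempdir(), "main.RData")  # hypothetical output path
save(list = names(mylist), envir = env, file = main_file)

# Reload into another fresh environment to check what the file contains
check <- new.env()
loaded <- load(main_file, envir = check)
sort(loaded)
```

Loading main.RData afterwards restores one object per original file, each under its own name.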

Alternatively, if the data frames all have the same columns and you want a single big data frame, you could collapse the list and add a variable holding the name of the contributing file (see "Dataframes in a list; adding a new variable with name of dataframe").

all <- do.call("rbind", mylist)
all$id <- rep(all.files, sapply(mylist, nrow))
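As a small worked example of the collapse step, here is a sketch with two hypothetical data frames that share the same columns. The result is called `combined` here only to avoid reusing the name of the base function all().

```r
# Hypothetical list of data frames that share the same columns
mylist <- list(
  file1 = data.frame(x = 1:2, y = c("a", "b")),
  file2 = data.frame(x = 3:5, y = c("c", "d", "e"))
)

combined <- do.call(rbind, mylist)

# Repeat each list name once per row of the data frame it came from
combined$id <- rep(names(mylist), times = vapply(mylist, nrow, integer(1)))

# rbind() builds row names such as "file1.1"; reset them if not wanted
rownames(combined) <- NULL
combined
```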
JWilliman
4

I think the best answer I have seen is the code below, copied from an SO answer that I can't track down right now. Apologies to the original author.

resave <- function(..., list = character(), file) {
   previous  <- load(file)
   var.names <- c(list, as.character(substitute(list(...)))[-1L])
   for (var in var.names) assign(var, get(var, envir = parent.frame()))
   save(list = unique(c(previous, var.names)), file = file)
}
# I took advantage of the fact that load() returns the names of
# the loaded variables, so I could use the function's own
# environment instead of creating one.
# When using get(), I was careful to look only in the environment
# from which the function is called, i.e. parent.frame().
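One way this could be applied to the question's case is to load each file into a scratch environment, copy its data frame to a variable named after the file, and append that variable to the combined file. This is only a sketch under assumptions: the demo files, the output path and the `placeholder` object are hypothetical, and each file is assumed to hold exactly one object.

```r
# resave() as given above, repeated so this sketch runs on its own
resave <- function(..., list = character(), file) {
  previous  <- load(file)
  var.names <- c(list, as.character(substitute(list(...)))[-1L])
  for (var in var.names) assign(var, get(var, envir = parent.frame()))
  save(list = unique(c(previous, var.names)), file = file)
}

# Hypothetical demo files, each holding one data frame called 'df'
paths <- file.path(tempdir(), c("file1.RData", "file2.RData"))
for (i in seq_along(paths)) {
  df <- data.frame(x = i)
  save(df, file = paths[i])
}

# resave() loads the target file first, so it must already exist;
# start it off with a placeholder object
main_file <- file.path(tempdir(), "main_resave.RData")
placeholder <- NULL
save(placeholder, file = main_file)

for (p in paths) {
  env <- new.env()
  obj <- load(p, envir = env)                      # name of the loaded object
  new_name <- tools::file_path_sans_ext(basename(p))
  assign(new_name, get(obj, envir = env))          # e.g. file1 <- df
  resave(list = new_name, file = main_file)        # append to the main file
}
```

Because resave() reloads and rewrites the whole file on every call, this gets slower as the combined file grows; for many files, building one list and calling save() once is likely cheaper.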
Carl Witthoft
  • Thank you Carl. I guess this works by specifying one initial saved .RData file and adding on to this file other R objects, but I think to apply it to my case I would need to apply this function to each of the .RData files in turn, each time loading an .RData file, changing the names of the objects so that I know which file the data frames came from (all of my .RData files load a data frame with the same exact name), and "appending" the data frame to the saved .RData file. Am I understanding this correctly? Thanks again! – user971102 Feb 07 '13 at 18:47
  • Ok maybe I am almost there thanks to your function (see above)…Just the naming part is not right I think… Thanks again! – user971102 Feb 07 '13 at 19:28
  • Sounds like you're on the right track. If by any chance your data objects are revisions of each other, i.e. lots of values in common, you might think about storing just the differences (sort of like SVN systems) – Carl Witthoft Feb 07 '13 at 19:40