I have several .RData files, each of which has letters and numbers in its name, eg. m22.RData. Each of these contains a single data.frame object, with the same name as the file, eg. m22.RData contains a data.frame object named "m22".

I can generate the file names easily enough with something like datanames <- paste0(rep(c("m","n"), each = 100), seq(1,100)) and then use load() on those, which will leave me with a few hundred data.frame objects named m1, m2, etc. What I am not sure of is how to do the next step -- prepare and merge each of these data frames without having to type out all their names.

I can make a function that accepts a data frame as input and does all the processing. But if I pass it datanames[22] as input, I am passing it the string "m22", not the data frame object named m22.

My end goal is to repeatedly do the same steps on a bunch of different data frames without manually typing out "prepdata(m1) prepdata(m2) ... prepdata(n100)". I can think of two ways to do it, but I don't know how to implement either of them:

  • Get from a vector of the names of the data frames to a list containing the actual data frames.
  • Modify my "prepdata" function so that it can accept the name of the data frame, but then still somehow be able to do things to the data frame itself (possibly by way of "assign"? But the last step of the function will be to merge the prepared data to a bigger data frame, and I'm not sure if there's a method that uses "assign" that can do that...)

Can anybody advise on how to implement either of the above methods, or another way to make this work?

bluemouse

2 Answers

Assuming all your data is in the same folder, you can build a vector of all the file paths, then write a function that takes the path to an .RData file, loads it, and calls "prepdata". Finally, using the purrr package, you can apply that function to each element of the path vector.

Something like this should work:

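library(purrr)

# Collect the full paths of all .RData files in the folder
rdata_paths <- list.files(path = "path/to/your/files", pattern = "\\.RData$", full.names = TRUE)

# load() returns the *names* of the loaded objects, not the objects themselves,
# so load into a private environment and fetch the object by name with get()
read_rdata <- function(path) {
  env <- new.env()
  obj_name <- load(path, envir = env)
  get(obj_name[1], envir = env)
}

prepdata <- function(data) {
  ### your prepdata implementation
}

master_function <- function(path) {
  data <- read_rdata(path)
  prepdata(data)
}

# map_df() row-binds the results into one data frame (requires dplyr to be installed)
merged_rdatas <- map_df(rdata_paths, master_function)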
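
For a quick end-to-end check, here is a minimal self-contained sketch of the same idea in base R. It writes two small files to a temporary folder, loads each one into its own environment, and row-binds the results. The demo data, the load_one helper, and the placeholder prepdata (which only adds a source column) are assumptions made for illustration, not part of the original question.

```r
# Hypothetical demo data: two .RData files, each holding one data frame
# named after its file (as described in the question)
tmp_dir <- file.path(tempdir(), "rdata_demo")
dir.create(tmp_dir, showWarnings = FALSE)
m1 <- data.frame(id = 1:2, value = c(10, 20))
n1 <- data.frame(id = 3:4, value = c(30, 40))
save(m1, file = file.path(tmp_dir, "m1.RData"))
save(n1, file = file.path(tmp_dir, "n1.RData"))
rm(m1, n1)

# load() returns the names of the restored objects, so load into a
# throwaway environment and fetch the single object by name
load_one <- function(path) {
  env <- new.env()
  obj_name <- load(path, envir = env)
  stopifnot(length(obj_name) == 1)
  get(obj_name, envir = env)
}

# Placeholder for the real processing step (assumption): tag each row
# with the name of the file it came from
prepdata <- function(data, label) {
  data$source <- label
  data
}

paths <- list.files(tmp_dir, pattern = "\\.RData$", full.names = TRUE)
prepared <- lapply(paths, function(p) {
  prepdata(load_one(p), tools::file_path_sans_ext(basename(p)))
})

# Row-bind everything into one data frame
merged <- do.call(rbind, prepared)
```

Loading into a fresh environment keeps the global workspace free of hundreds of m1, m2, ... objects, and it means the function never needs to know the object's name in advance.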
luizgg

See this answer and the corresponding R FAQ

Basically:

temp1 <- c(1, 2, 3)
save(temp1, file = "temp1.RData")
x <- c()
x[1] <- load("temp1.RData")  # load() returns the object's name, "temp1"
get(x[1])                    # get() looks up the object with that name

#> [1] 1 2 3
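
If the data frames are already loaded into the workspace, the same name-to-object lookup can be done for a whole vector of names at once with mget(), which returns a named list. A minimal sketch, with made-up data frames m1 and m2 standing in for the real ones:

```r
# Hypothetical stand-ins for data frames that load() has already created
m1 <- data.frame(id = 1:2, value = c(10, 20))
m2 <- data.frame(id = 3:4, value = c(30, 40))

datanames <- c("m1", "m2")

# mget() looks up several names at once and returns a named list of the objects
frames <- mget(datanames)

# From here, ordinary list tools apply; for example, row-bind them all
combined <- do.call(rbind, frames)
```

The resulting list can also be passed to something like lapply(frames, prepdata) before combining, so no data frame name ever has to be typed by hand.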

kkeey
  • Thanks! Did not realize that the result of "load" could be assigned to a name of my choosing or a place in a list -- that makes this all MUCH easier! – bluemouse Nov 15 '19 at 00:00
  • To correct my other comment in case anyone searching finds this later: the magic isn't in load (which returns the *name* of the object), it's in get (which takes the name and returns the data frame). E.g. if I wanted to name my loaded data frame "chunk" I'd do varname <- load("m100.RData") then chunk <- get(varname). – bluemouse Nov 15 '19 at 03:20