In R, I often need to perform the same operation on a group of variables.

As an example, in my environment I currently have the following data frames:

df_model1
df_model2
df_model3
df_model4
df_model5

and I have another data frame called df_obs.

What I need to do is merge each of the df_model* data frames with df_obs.

What I usually do is something like this:

new_df_model1 <- merge(df_obs, df_model1, all=TRUE)
new_df_model2 <- merge(df_obs, df_model2, all=TRUE)
...

and so on, which is clearly not very practical.

How can I make this operation more programmatic?

thiagoveloso
  • Instead of merging them you can also combine them in a list `list(df_model1, df_model2, df_model3, df_model4, df_model5)` and work with `lapply` to perform the same function on each list element/model – Rentrop May 16 '15 at 10:32
  • @Floo0, what would the lapply command look like in this case? – thiagoveloso May 16 '15 at 10:58

2 Answers

You could use Map to merge df_obs with each of the df_model datasets, collecting the results in a list.

 lst <- Map(`merge`, list(df_obs), 
          mget(paste0('df_model', 1:5)), MoreArgs=list(all=TRUE))

If the output datasets in the list need to be separate data.frame objects in the global environment, we can use list2env (but I would prefer to keep them in the list, as most operations can be done within the list):

 names(lst) <- paste('new',paste0('df_model', 1:5), sep="_")
 list2env(lst, envir= .GlobalEnv)

Or, using lapply:

 lapply(mget(paste0('df_model', 1:5)), merge, x = df_obs, all=TRUE)
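To make the list-based approach concrete, here is a minimal sketch on toy data. The column names (`date`, `obs`, `sim`) and the values are assumptions made up for illustration, not taken from the question; only two model data frames are used to keep it short.

```r
# Toy data (hypothetical): observations plus two model outputs sharing a "date" key
df_obs    <- data.frame(date = 1:3, obs = c(10, 20, 30))
df_model1 <- data.frame(date = 2:4, sim = c(21, 29, 41))
df_model2 <- data.frame(date = 1:2, sim = c(11, 19))

# Look the model data frames up by name and collect them in a named list
models <- mget(paste0("df_model", 1:2))

# Full outer join of df_obs with each model; the result is again a named list
merged <- lapply(models, function(m) merge(df_obs, m, by = "date", all = TRUE))

names(merged)            # "df_model1" "df_model2"
nrow(merged$df_model1)   # 4 (dates 1 to 4)
```

Each merged data frame can then be accessed as `merged$df_model1` or `merged[["df_model1"]]`, so downstream code can loop over the list instead of referring to separately named variables.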
akrun
  • This is really close to what I wish, but the output I need are new variables `new_df_model1`, `new_df_model2` and so on, rather than a list. Doable? – thiagoveloso May 16 '15 at 10:45
  • @thiagoveloso That is easy, but I would prefer to keep it in a list without having a lot of variables floating in the global environment – akrun May 16 '15 at 10:46
  • @thiagoveloso If an expert user like akrun puts all the output in a `list` object, there is a reason. It's very bad practice to have different objects polluting your global environment when you could put them in a list, which also simplifies calling the same function on each of them. Learn to use `list`s and you won't regret it. – nicola May 16 '15 at 10:55
  • @nicola I am sure there are many advantages to this, but the output variables will be used to generate Taylor diagrams. For this purpose, it would be more complicated if they were in a list. – thiagoveloso May 16 '15 at 10:58
  • I can't see a single instance in which what you say is true, in all honesty. It's much more complicated to keep them separated, no matter what you are trying to achieve. – nicola May 16 '15 at 11:01

You could use a for loop

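for(i in 1:5) {
  # Name of the variable that will hold the merged result
  data <- paste("new_df_model", i, sep = "")
  # get() retrieves the data frame by name; paste() alone only returns a string
  model <- merge(df_obs, get(paste("df_model", i, sep = "")), all = TRUE)
  assign(data, model)
}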
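
For a loop of this kind to work, each data frame has to be looked up by name with `get()`, since `paste()` only builds the name as a character string. A minimal sketch on toy data follows; the columns `date`, `obs`, and `sim` and their values are assumptions for illustration only.

```r
# Toy data (hypothetical), keyed on "date"
df_obs    <- data.frame(date = 1:3, obs = c(10, 20, 30))
df_model1 <- data.frame(date = 2:4, sim = c(21, 29, 41))
df_model2 <- data.frame(date = 1:2, sim = c(11, 19))

for (i in 1:2) {
  model_name <- paste0("df_model", i)
  # get() fetches the data frame whose name is stored in model_name
  merged <- merge(df_obs, get(model_name), by = "date", all = TRUE)
  # assign() creates new_df_model1, new_df_model2, ... in the current environment
  assign(paste0("new_", model_name), merged)
}

exists("new_df_model2")  # TRUE
```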
dsifford