35

A post on here a day back has me wondering how to assign values to multiple objects in the global environment from within a function. This is my attempt using lapply (assign may be safer than <<- but I have never actually used it and am not familiar with it).

#fake data set
df <- data.frame(
  x.2=rnorm(25),
  y.2=rnorm(25),
  g=rep(factor(LETTERS[1:5]), 5)
)

#split it into a list of data frames
LIST <- split(df, df$g)

#pre-allot 5 objects in R with class data.frame()
V <- W <- X <- Y <- Z <- data.frame()

#attempt to assign the data frames in the LIST to the objects just created
lapply(seq_along(LIST), function(x) c(V, W, X, Y, Z)[x] <<- LIST[[x]])

Please feel free to shorten any/all parts of my code to make this work (or work better/faster).

smci
Tyler Rinker
  • 7
    This question or any answer to it should come with a big "Children, don't do this at home!" disclaimer. As you might know, global assignments within functions are a recipe for disaster, or "life by a volcano" to quote Patrick Burns (http://www.burns-stat.com/pages/Tutor/R_inferno.pdf) – flodel Mar 16 '12 at 01:58
  • @flodel I'm no programmer so can you briefly explain what the issue with doing assign is? – Tyler Rinker Mar 16 '12 at 02:26
  • So I read the section you quoted. That seems intelligent for code for public consumption but not for personal code. Can you see a way to achieve this effect without? – Tyler Rinker Mar 16 '12 at 02:30
  • 1
    I'll quote Wikipedia on global variables: *They are usually considered bad practice precisely because of their non-locality: a global variable can potentially be modified from anywhere (unless they reside in protected memory or are otherwise rendered read-only), and any part of the program may depend on it.[1] A global variable therefore has an unlimited potential for creating mutual dependencies, and adding mutual dependencies increases complexity.* – flodel Mar 16 '12 at 02:35
  • 1
    Ok... With a better look into your particular situation, you do not seem to wander too far from your global environment, so the risk for "mutual dependencies" is very small here, and Josh's answer is perfectly fine. Still, I hope my warning can help other people who might be tempted to "Assign multiple objects to .GlobalEnv from within a function" (your question title) but in a more intricate context. – flodel Mar 16 '12 at 02:57
  • I think part of the objections you're seeing depend on whether we're assigning 5 empty dataframes, or 5 aliases to what may in future be the same dataframe(s), or some subset of records in it (which would be a major code smell suggesting that you should be passing around vectors of indices, or indeed using a different data-structure package). Global variables are bad enough, but many global dataframes sounds like the path to madness. – smci Apr 28 '16 at 13:32
  • 1
    I gotta say, when it comes to complex/dirty data, developing functions with heavy UI (select.list(), locator(), identify(), getGraphicsEvent()) for a group of people who do not program, this has been a great way to bring them into the program without having to train them in programming. – MadmanLee Feb 03 '18 at 02:34

4 Answers

54

Update of 2018-10-10:

The most succinct way to carry out this specific task is to use list2env() like so:

## Create an example list of five data.frames
df <- data.frame(x = rnorm(25),
                 g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)

## Assign them to the global environment
list2env(LIST, envir = .GlobalEnv)

## Check that it worked
ls()
## [1] "A"    "B"    "C"    "D"    "df"   "E"    "LIST"
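
If the objects should be called V, W, X, Y and Z as in the question, rather than A to E, one option is to rename the list elements before handing them to list2env(). A minimal sketch, assuming the names stored in NAMES are the ones wanted:

```r
## Recreate the example list of five data.frames
df <- data.frame(x = rnorm(25),
                 g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)

## Names taken from the question; any valid, unique names would do
NAMES <- c("V", "W", "X", "Y", "Z")

## setNames() relabels the list elements; list2env() then creates one
## object per element in the environment given by envir
list2env(setNames(LIST, NAMES), envir = .GlobalEnv)

## Each new object holds the rows for one level of g
nrow(V)
unique(as.character(Z$g))
```

The renaming relies on the order of the list, so the first element (level A) becomes V and the last (level E) becomes Z.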

Original answer, demonstrating use of assign()

You're right that assign() is the right tool for the job. Its envir argument gives you precise control over where assignment takes place -- control that is not available with either <- or <<-.

So, for example, to assign the value of X to an object named NAME in the global environment, you would do:

assign("NAME", X, envir = .GlobalEnv)

In your case:

df <- data.frame(x = rnorm(25),
                 g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)
NAMES <- c("V", "W", "X", "Y", "Z")

lapply(seq_along(LIST), 
       function(x) {
           assign(NAMES[x], LIST[[x]], envir=.GlobalEnv)
        }
)

ls()
[1] "df"    "LIST"  "NAMES" "V"     "W"     "X"     "Y"     "Z"    
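
To illustrate the point about control over where assignment takes place, here is a small sketch contrasting <<- with assign(). The functions outer_fun and inner are hypothetical names made up for this example:

```r
## Hypothetical helper: the inner function assigns to "out" in two ways
outer_fun <- function() {
  out <- "local to outer_fun"
  inner <- function() {
    ## <<- searches the enclosing environments and modifies the first
    ## 'out' it finds, which here is the one inside outer_fun
    out <<- "changed by <<-"
    ## assign() with envir = .GlobalEnv always targets the global environment
    assign("out", "set by assign()", envir = .GlobalEnv)
  }
  inner()
  out
}

res <- outer_fun()
res   # the value of outer_fun's own 'out'
out   # the object created in the global environment
```

With <<- the target depends on what happens to exist in the enclosing environments, whereas the envir argument states the target explicitly.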
Josh O'Brien
  • Thanks. I knew `<<-` is bad form but have been lazy. I'll get into the habit of using `assign`. I actually had exactly what you do at one point but thought I shouldn't put quotation marks around the objects as I had already pre-allotted them. This is easier anyhow. – Tyler Rinker Mar 15 '12 at 19:41
  • Yeah. Using `assign()` is one of those things that looks like a hump to get over before you try it out, but once you do, you wonder why you ever had any hesitation in the first place. – Josh O'Brien Mar 15 '12 at 19:54
  • +1 - gold. I had to assign all objects in an environment into the global environment just now, and this solution saved me having to think it out. – ricardo Jan 31 '14 at 05:13
  • When should I use envir=.GlobalEnv and when envir=parent.frame()? – skan Nov 26 '18 at 14:13
1

I think this question can have a nice crossover with this one: Can lists be created that name themselves based on input object names?

Say you want to apply the same modification to a set of objects on the fly. But list2env() requires a named list, and you don't want to copy and paste the names again. Borrowing the namedList function and combining it with Josh O'Brien's answer:

> namedList <- function(...) {
+   L <- list(...)
+   snm <- sapply(substitute(list(...)), deparse)[-1]
+   if (is.null(nm <- names(L))) nm <- snm
+   if (any(nonames <- nm=="")) nm[nonames] <- snm[nonames]
+   setNames(L ,nm)
+ }
> 
> df_1 <- data.frame(x = 1)
> df_2 <- data.frame(x = 2)
> df_3 <- data.frame(x = 3)
> 
> list2env(lapply(namedList(df_1, df_2, df_3), function(x) {
+   x <- cbind.data.frame(x, y = "B")
+ }), envir = .GlobalEnv)
<environment: R_GlobalEnv>
> 
> df_1
  x y
1 1 B
> df_2
  x y
1 2 B
> df_3
  x y
1 3 B
0

If you have a list of object names and file paths you can also use mapply:

object_names <- c("df_1", "df_2", "df_3")
file_paths   <- list.files({path}, pattern = "\\.csv$", full.names = TRUE)
    
mapply(function(df_name, file) 
           assign(df_name, read.csv(file), envir=.GlobalEnv),
       object_names,
       file_paths)
  • I used list.files() to construct a vector of all the .csv files in a specific directory. But file_paths could be written or constructed in any way.
  • If the files you want to read in are in the current working directory, then file_paths could be replaced with a character vector of file names.
  • In the code above, you need to replace {path} with a string of the desired directory's path.
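
As a variation on the same idea, the object names can be derived from the file names and the whole set handed to list2env() in one step. This is only a sketch under stated assumptions: the three CSV files are written to a temporary directory purely for illustration, and csv_dir and csv_list are made-up names.

```r
## Write three small CSV files to a temporary directory (stand-ins for real data)
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(x = i),
            file.path(csv_dir, paste0("df_", i, ".csv")),
            row.names = FALSE)
}

## Anchor the pattern so that only names ending in ".csv" match
file_paths <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

## Derive the object names from the file names: "df_1.csv" becomes "df_1"
object_names <- tools::file_path_sans_ext(basename(file_paths))

## Read each file, name the results, and copy them into the global environment
csv_list <- setNames(lapply(file_paths, read.csv), object_names)
list2env(csv_list, envir = .GlobalEnv)

## Inspect one of the new objects
df_2
```

Building the names from file_paths keeps the two vectors the same length and in the same order, which the mapply version has to guarantee by hand.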
Danielle
0

This demonstrates how to split out a nested dataframe into objects in the global environment with tidyverse functions:

library(tidyverse)
library(palmerpenguins)

penguins %>% 
  group_nest(species) %>% 
  deframe() %>% 
  list2env(.GlobalEnv)
Conor