Question:

I'm using sys.source() to run a script in a new environment. However, that script itself calls source() on other files.

When it sources functions, they (and their output) get loaded into R_GlobalEnv instead of into the environment specified by sys.source(). It seems the functions' enclosing and binding environments end up being R_GlobalEnv instead of the one you pass to sys.source().

Is there a way like sys.source() to run a script and keep everything it makes in a separate environment? An ideal solution would not require modifying the scripts I'm sourcing and still have "chdir = TRUE" style functionality.

Example:

Running this should show you what I mean:

# setup an external folder
other.folder = tempdir()

# make a functions script, it just adds "1" to the argument.
# Note: the strange-looking "assign(x=" bit is important 
# to what I'm actually doing, so any solution needs to be
# robust to this.
functions = file.path(other.folder, "functions.R")
writeLines("myfunction = function(a){assign(x=c('function.output'), a+1, pos = 1)}", functions)

# make a parent script, which source()'s functions.R
# and invokes it on some data, and then modifies that data
parent = file.path(other.folder, "parent.R") 
writeLines("source('functions.R')\n
           original.data=1\n
           myfunction(original.data)\n
           resulting.data = function.output + 1", parent)

# make a separate environment
myenv = new.env()

# source parent.R into that new environment,
# using chdir=TRUE so parent.R can find functions.R
sys.source(parent, myenv, chdir = TRUE)

# You can see "myfunction" and "function.output" 
# end up in R_GlobalEnv.

# Whereas "original.data" and "resulting.data" end up in the intended environment.
ls(myenv) 

More information (what I'm actually trying to do):

I have data from several similar experiments. I'm trying to keep everything in line with "reproducible research" ideals (for my own sanity if nothing else). So what I'm doing is keeping each experiment in its own folder. The folder contains the raw data, and all the metadata which describes each sample (treatment, genotype, etc.). The folder also contains the necessary R scripts to read the raw data, match it with metadata, process it, and output graphs and summary statistics. These are tied into a "mother script" which will do the whole process for each experiment.

This works really well but if I want to do some meta-analysis or just compare results between experiments there are some difficulties. Right now I am thinking the best way would be to run each experiment's "mother script" in its own environment, and then pull out the data from each environment to do my meta-analysis. An alternative approach might be running each mother script in its own instance and then saving the .RData files separately and then re-loading them into a new environment in a new instance. This seems kinda hacky though and I feel like there's a more elegant solution.
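For reference, the save-and-reload alternative could look roughly like the sketch below. It only shows the save()/load() round trip; `results.file`, `exp.env`, `exp1` and the numbers in `summary.stats` are made up for illustration, and in practice the save() call would run at the end of each mother script in its own R session.

```r
# Hypothetical file name; each experiment would write its own .RData file.
results.file <- tempfile(fileext = ".RData")

# Stand-in for the objects one mother script leaves behind (made-up values).
exp.env <- new.env()
assign("summary.stats", c(mean = 2.5, sd = 0.4), envir = exp.env)
save(list = ls(exp.env), file = results.file, envir = exp.env)

# Later, in the meta-analysis session: load each file into its own environment
# so objects with the same name in different experiments do not collide.
exp1 <- new.env()
load(results.file, envir = exp1)
ls(exp1)
```

Because load() takes an `envir` argument, nothing has to touch the global environment of the meta-analysis session.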

ccoffman
  • I think you'll have to modify the scripts you're sourcing so that they all assign into explicit environments. – Thomas Sep 01 '15 at 01:15
  • Try something like `evalq(source(parent), myenv)` – Hong Ooi Sep 01 '15 at 02:27
  • The help page for sys.source has an example that uses: `env <- attach(NULL, name = "myenv")`. Have you tried that? – IRTFM Sep 01 '15 at 02:35
  • @HongOoi Just tried that, both with source() and sys.source() (using chdir = TRUE) and it didn't end up putting anything in the environment, everything ended up in the global environment. – ccoffman Sep 01 '15 at 08:58
  • @BondedDust I did try that, but to no avail. It just does the same thing. – ccoffman Sep 01 '15 at 08:59
  • If you are trying to defend against source files you can't edit, which might be assigning directly to position 1, then I think you are going to be limited to copying out your environment, moving the new entries, and restoring the old. If you can edit them, just don't assign to position 1 and set the `local` argument of `source` to `TRUE`. – A. Webb Sep 01 '15 at 16:27
  • So you mean, `source()` a file, copy everything into a new environment, clear the global, and then `source()` the next one? So elegant! Thanks! – ccoffman Feb 05 '16 at 13:31
  • With respect to using sys.source you may want to specify top level environment as shown [here](https://stackoverflow.com/a/50158955/1655567). – Konrad May 03 '18 at 16:14
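
A rough sketch of the "move the new entries" idea from A. Webb's comment, assuming the scripts cannot be edited. `source_into` is a hypothetical helper, not something from base R or from this thread. It only catches names that did not exist in the global environment before the call; globals the script overwrites are not detected or restored.

```r
# Hypothetical helper: source `file` into `target`, then move anything the
# script newly created in the global environment into `target` as well.
source_into <- function(file, target = new.env(parent = globalenv()), chdir = TRUE) {
  before <- ls(globalenv(), all.names = TRUE)
  # local = target takes care of ordinary top-level assignments in `file`
  source(file, local = target, chdir = chdir)
  # Names that nested source() calls or assign(pos = 1) left in the global env
  created <- setdiff(ls(globalenv(), all.names = TRUE), before)
  for (name in created) {
    assign(name, get(name, envir = globalenv()), envir = target)
  }
  rm(list = created, envir = globalenv())
  invisible(target)
}

# Same layout as the example in the question, in a fresh temporary folder.
demo.dir <- tempfile("experiment")
dir.create(demo.dir)
writeLines("myfunction = function(a){assign(x=c('function.output'), a+1, pos = 1)}",
           file.path(demo.dir, "functions.R"))
writeLines(c("source('functions.R')",
             "original.data = 1",
             "myfunction(original.data)",
             "resulting.data = function.output + 1"),
           file.path(demo.dir, "parent.R"))

myenv <- source_into(file.path(demo.dir, "parent.R"))
ls(myenv)
```

Note that the moved `myfunction` still has the global environment as its enclosing environment, so calling it again later will write `function.output` to the global environment again.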

0 Answers