See below for a reprex of my issues with source(), <-, <<-, environments, etc. There are three files: testrun.R, which sources inputs.R and CODE.R.

    # testrun.R (file 1)

    today <<- "abcdef"

    source("inputs.R")

    for (DC in c("a", "b")) {
      usedlater_3 <- paste("X", DC, used_later2)
      print(usedlater_3)
      source("CODE.R", local = TRUE)
    }

    final_output <- paste(OD_output, used_later2, usedlater_3)
    print(final_output)



    # #---- file 2
    # # inputs.R
    # used_later1 <- paste(today, "_later")
    # used_later2 <- "l2"
    # 
    # #---- file 3
    # # CODE.R
    # OD_output <- paste(DC, today, used_later1, usedlater_2, usedlater_3)

I'm afraid I never learned R or CS properly, so I'm trying to catch up now. Any bigger-picture lessons would be helpful. Until now I have relied on a global environment where I keep everything (and save it between sessions), but I'm now trying to make everything reproducible, so I'm using RStudio to run local jobs that start from scratch.

I've been trying different combinations of <-, <<-, and source(local = TRUE) (instead of local = FALSE). I do use functions for pieces of code where I know the inputs I need and the outputs I want, but as you can see, CODE.R uses variables from three places: testrun.R, the loop inside testrun.R, and inputs.R. Converting some of the code into functions might help, but I'd also like to know of alternatives for this case.
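For reference, here is a minimal sketch of what the `local` argument of `source()` changes. It is not taken from the files above: the temporary script and the function names `run_local` and `run_global` are made up for illustration. The sourced script reads a variable `x`, and which `x` it sees depends on the environment `source()` evaluates in.

```r
# Hypothetical one-line script standing in for a sourced file such as CODE.R.
script <- tempfile(fileext = ".R")
writeLines("result <- paste('saw', x)", script)

# Put x in the global environment explicitly, so the demo does not depend
# on how this snippet itself is run.
assign("x", "global x", envir = globalenv())

run_local <- function() {
  x <- "function-level x"
  source(script, local = TRUE)    # evaluated in this function's environment
  result                          # created here, next to the local x
}

run_global <- function() {
  x <- "function-level x"
  source(script, local = FALSE)   # evaluated in the global environment
  get("result", envir = globalenv())
}

local_result <- run_local()
global_result <- run_global()

print(local_result)
print(global_result)
```

With `local = TRUE` the script sees the `x` of the function that called `source()`. With `local = FALSE` it is evaluated in the global environment and never sees variables that exist only in the caller.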

Finally, here is my troubleshooting log, which shows my thought process:

  • First run: variable today wasn't found, so I changed it to a double-arrow assignment, today <<- "abcdef".
  • Second run: DC not found, so I will switch to local = TRUE.
  • Third run: now usedlater_2 is not found, so I will assign it with <<-. (What about used_later1? Why didn't that show up as an error? We'll see...)
  • Result of third run: usedlater_2 is still not found when CODE.R needs it. I'm out of ideas. Note: used_later2 was found when creating usedlater_3 in the for loop in testrun.R.
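One thing worth checking against the log above: in the reprex as posted, inputs.R assigns `used_later2`, while CODE.R asks for `usedlater_2`. These are two different names, so no change to the assignment operator or to `local` can make the lookup succeed. A small sketch, using only the values shown in the reprex:

```r
# Names assigned by inputs.R in the reprex (values copied from the post).
today <- "abcdef"
used_later1 <- paste(today, "_later")
used_later2 <- "l2"

# inputs.R defines used_later2; CODE.R looks up usedlater_2.
has_used_later2 <- exists("used_later2")
has_usedlater_2 <- exists("usedlater_2")

# Capture the error that the lookup in CODE.R would raise.
msg <- tryCatch(
  paste(today, used_later1, usedlater_2),
  error = function(e) conditionMessage(e)
)

print(has_used_later2)
print(has_usedlater_2)
print(msg)
```

This would also explain why `used_later1` never raised an error: CODE.R spells that name the same way inputs.R does.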
Bhargav Rao
Arthur Yip
  • Do I need to use assign? globalenv? parent.env? Or switch everything to functions instead of source? – Arthur Yip Aug 12 '19 at 00:40
  • *In general*, the use of `<<-` (assign to some "higher" environment) indicates a problem (with the function or workflow). There are certainly exceptions, but many consider it better practice (when writing a function) to never look for objects outside of its function definition (using *arguments* for things that are needed) and, more to the point, to never change anything in other environments (which is a "side effect", something that greatly complicates troubleshooting and reproducibility). – r2evans Aug 12 '19 at 01:02
  • I'm inferring that you are running one script after another, in a particular order, and each script does something to completion and leaves behind "artifacts" on which future script-files rely. This sounds like each script-file should really be a function where the "artifacts" that are needed should be passed as arguments to the function. – r2evans Aug 12 '19 at 01:05
  • Thanks. Seems like I was getting away with sourcing subroutines in R scripts because of R's lexical scoping and reading in global variables / variables in the parent environment. Will try to convert most of my subroutines into functions and variables/parameters into lists. – Arthur Yip Aug 12 '19 at 03:14
  • Actually I'm confused - lexical scoping applies to functions (which I have been using), but when I use source("CODE.R", local = FALSE), does the code grab variables from the parent environment or global environment? – Arthur Yip Aug 12 '19 at 03:38
  • I also tried to explain what I did, why I did it that way (writing and sourcing subroutine scripts), and why it seems like writing functions is the right way to go: https://stackoverflow.com/a/57455236/4663008 – Arthur Yip Aug 12 '19 at 03:43
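A rough sketch of the function-based layout suggested in the comments, applied to the reprex. The function names `make_inputs` and `run_code` are hypothetical, and the sketch assumes that `usedlater_2` in CODE.R was meant to be `used_later2`. Everything each step needs is passed as an argument, so nothing depends on `<<-` or on the environment that `source()` happens to use.

```r
# Hypothetical replacement for inputs.R: return the inputs as a named list.
make_inputs <- function(today) {
  list(
    today = today,
    used_later1 = paste(today, "_later"),
    used_later2 = "l2"
  )
}

# Hypothetical replacement for CODE.R: all dependencies are arguments.
# Assumption: usedlater_2 in the original CODE.R was meant to be used_later2.
run_code <- function(DC, inputs, usedlater_3) {
  paste(DC, inputs$today, inputs$used_later1, inputs$used_later2, usedlater_3)
}

# Equivalent of testrun.R.
inputs <- make_inputs(today = "abcdef")

for (DC in c("a", "b")) {
  usedlater_3 <- paste("X", DC, inputs$used_later2)
  print(usedlater_3)
  OD_output <- run_code(DC, inputs, usedlater_3)
}

# As in the original script, this uses the values from the last iteration.
final_output <- paste(OD_output, inputs$used_later2, usedlater_3)
print(final_output)
```

In this layout, inputs.R and CODE.R would each only define a function, and testrun.R would source them once at the top and then call them.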
