I've done what you described and split chunks of code into separate R files, running source(this) and source(that), but I've been painfully learning that sourcing functions (rather than subroutines/script files) is the better way to go.
Here are 3 possible reasons why we might have developed our scripts in this way and stuck to it, each followed by why switching to functions makes sense:
1) We wanted to debug directly when a script went wrong (to be able to track all variables and their status in the single global environment).
- I've now realized that RStudio's debugger / traceback is a much better way to do true debugging.
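
For the debugging point, here is a minimal sketch of how the base R tools apply once the code lives in a function. `clean_scores` is a made-up function for illustration only; `debugonce()` and `traceback()` are base R, and as far as I understand RStudio's debugger builds on the same machinery.

```r
# Hypothetical function, only here to have something to debug.
clean_scores <- function(scores, max_score) {
  if (!is.numeric(scores)) {
    stop("scores must be numeric")
  }
  scores / max_score
}

# Interactive use (commented out so the snippet also runs non-interactively):
# debugonce(clean_scores)     # step through the next call line by line
# clean_scores(c(8, 9), 10)
# traceback()                 # after an error, print the call stack

# Non-interactive check: capture the error message instead of stopping.
msg <- tryCatch(
  clean_scores("eight", 10),
  error = function(e) conditionMessage(e)
)
ok <- clean_scores(c(8, 9), 10)
```

Inside the debugger you only see the function's own variables, which is usually a much shorter list than the whole global environment.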
2a) We didn't know what variables needed to be kept for later (didn't want to keep track of which variables to put into functions and which variables to output from functions)
- Functions force us to be explicit about what one part of a script uses and what it doesn't, and about what is essential to keep from that part, since it's unnecessary to output everything it creates. Variables are better kept only in the environments where they are needed, rather than passing everything in and out of the global environment.
- Also, I think environments can act like lists, so it should be possible to pass a whole environment into a function and get it back out. I need to do more reading/learning about this.
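
On the environment idea, a small sketch of what seems to work in base R. The names `state` and `mark_processed` are invented for the example; `new.env()`, `as.list()` and `list2env()` are base R functions. One difference from lists worth knowing: an environment is modified in place when a function writes to it, whereas a list is copied.

```r
# An environment is passed by reference, so a function can modify it in place.
state <- new.env()
state$n_rows <- 100
state$label <- "raw"

# Hypothetical helper: reads from and writes to the environment it is given.
mark_processed <- function(env) {
  env$label <- "processed"
  env$n_rows_half <- env$n_rows / 2
  invisible(env)
}

mark_processed(state)

# Converting between environments and lists.
as_list <- as.list(state)              # environment -> named list
back <- list2env(list(a = 1, b = 2))   # named list -> new environment
```

Because of the in-place behavior, passing one big environment around can quietly bring back the "everything is global" problem, so a list of explicit inputs is probably the safer default.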
2b) We have a large number of variables for everything (parameters/variables, settings, different parts of data) so it's impractical to stuff everything in and out of functions.
- With structures like lists, we can lump categories of variables together and pass each group into a function as one argument. Functions can also return a list (rather than a single value), so several results come back together.
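
A rough sketch of the list approach, assuming a made-up `settings` list and a made-up `summarise_scores` function (neither comes from any package):

```r
# Group related settings into one list instead of many loose globals.
settings <- list(threshold = 0.5, n_top = 2)

# Hypothetical step: takes data and settings, returns only what later steps need.
summarise_scores <- function(scores, settings) {
  kept <- scores[scores > settings$threshold]   # local, never leaves the function
  top <- head(sort(kept, decreasing = TRUE), settings$n_top)
  list(n_kept = length(kept), top = top)
}

result <- summarise_scores(c(0.2, 0.9, 0.7, 0.6), settings)
result$n_kept   # how many scores passed the threshold
result$top      # the highest scores among those kept
```

The intermediate `kept` vector stays inside the function, and the caller gets one `result` object to pass along to the next step.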
Related SO Q&A:
Comments from others welcome!