Combine and run multiple R scripts from another script

Question

Essentially, I have constructed a sizable predictive model in R with about 10~15 separate script files for collecting, sorting, analyzing and presenting my data. Rather than put everything into one gigantic script file, I would like to maintain some level of modularity and run each piece from a control script, or some comparable control mechanism, as I've done in MATLAB before. Is this possible in R?

I have read the thread "Organizing R Source Code" as well as its related threads, but couldn't find this exact answer.

Other than `source()`ing each script from a single central script, the answer really is to write a package. That's how it's done in R. — joran, Aug 12 '14 at 20:09
@joran I've been tinkering around with R packages for analytical projects but I never really saw a clear advantage or a clean way to do it. What I consider useful is creating an entirely separate package for the functions and possibly the data. I'm highly interested in learning about other people's approaches. — NoBackingDown, Jul 06 '17 at 08:48

Anders Ellern Bilgrau · Answer 1 · 2020-03-12T09:49:42.567

34

I think you're simply looking for the `source()` function. See `?source`. I often have a master script which sources other .R files.
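A minimal sketch of such a master script follows. The file names `01_collect.R` and `02_sort.R` are made up for illustration, and the example writes them to a temporary folder only so that it runs on its own; in a real project the files would already exist.

```r
# Create two small example scripts in a temporary folder (stand-ins for real files).
script_dir <- tempfile("scripts_")
dir.create(script_dir)
writeLines("raw_data <- c(3, 1, 2)", file.path(script_dir, "01_collect.R"))
writeLines("sorted_data <- sort(raw_data)", file.path(script_dir, "02_sort.R"))

# The "master" part: run each script in order.
# By default source() evaluates in the global environment, so objects
# created by one script are visible to the next.
source(file.path(script_dir, "01_collect.R"))
source(file.path(script_dir, "02_sort.R"))

print(sorted_data)
```

Because the scripts share the global environment, the order of the `source()` calls matters: the second script relies on `raw_data` created by the first.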

edited Mar 12 '20 at 09:49

answered Aug 12 '14 at 20:08


score 13 · Answer 2 · answered Oct 23 '18 at 08:53

I am a new developer, and I am answering with an example that worked for me because no one has given one. Here is an example of using `source()` to run one of two other R scripts, "myscript_A.R" or "myscript_B.R", depending on a condition:

# `condition` and `X` are placeholders for your own variable and value
if (condition == X) {
    source("myscript_A.R")
} else {
    source("myscript_B.R")
}

Arthur Yip · Answer 3 · 2020-08-07T05:38:57.823

I've done what you described: I split up chunks of code into separate R files and have been running source(this) and source(that). But I've been painfully learning that sourcing functions (rather than subroutines/script files) is the better way to go.

Here are 3 possible reasons why we might have developed our scripts in this way and stuck to it, and 3 reasons why switching to functions makes sense:

1) We wanted to debug directly when a script went wrong (be able to track all variables and their status in the single global environment).

I've now realized that RStudio's debugger / traceback is a much better way to do true debugging.

2a) We didn't know what variables needed to be kept for later (didn't want to keep track of which variables to put into functions and which variables to output from functions)

Functions help force us to be explicit about what gets used in one part of a script and what doesn't, and about what is essential to keep from that part, since it's unnecessary to output everything. Variables are better kept only in the environments where they are needed, rather than everything being passed in and out of the global environment.
Also, I think environments can act like lists, so I think it's possible to pass a whole environment into a function and back out?? Need to do more reading/learning about this.

2b) We have a large number of variables for everything (parameters/variables, settings, different parts of data) so it's impractical to stuff everything in and out of functions.

With structures like lists, we can lump categories of variables together and send them into functions. Functions can also return lists (rather than single variables).
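As a rough illustration of points 2a and 2b, here is a small sketch. The names `settings`, `summarise_run`, `state` and `add_to_state` are hypothetical, chosen only for this example. It shows a list of settings going into a function, a named list coming back out, and an environment being passed to a function and converted with `as.list()`.

```r
# Group related settings in one list instead of many loose variables.
settings <- list(n_obs = 5, scale = 2)   # hypothetical parameters

# A function with explicit inputs that returns several results as a named list.
summarise_run <- function(x, settings) {
  scaled <- x[seq_len(settings$n_obs)] * settings$scale
  list(scaled = scaled, total = sum(scaled))
}

result <- summarise_run(1:10, settings)
result$total

# An environment can also be passed to a function. Unlike a list, it is
# modified in place; as.list() converts it when a plain list is needed.
state <- new.env()
add_to_state <- function(env, value) {
  env$last_value <- value
  invisible(env)
}
add_to_state(state, 42)
as.list(state)$last_value
```

Lists are copied when modified inside a function, so returning them explicitly keeps the data flow visible. Environments are shared by reference, which is convenient but makes it easier to lose track of what changed where.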

Comments from others welcome!

score 7 · Answer 4 · answered Mar 21 '21 at 17:44


You can source all .R scripts from a folder:

# Load tidyverse (attaches purrr, which provides map(), and the %>% pipe)
library(tidyverse)

# List the .R files in the folder and source each one
list.files("path_to_folder", pattern = "\\.R$", full.names = TRUE) %>% map(source)

Here, you list all the files in a folder, then map the source() function over them.

This solution is more useful if each script contains some functions, and you would then like to use these functions in a master_script.
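For readers who would rather not attach the tidyverse, the same idea can be sketched in base R with `lapply()`. Note that `map()` comes from the purrr package, which `library(tidyverse)` attaches; it is not exported by the tidyverse package itself. The folder and the two function files below are invented for the example and are created in a temporary directory so the snippet runs on its own.

```r
# Set up a throwaway folder with two scripts that each define a function.
fun_dir <- tempfile("functions_")
dir.create(fun_dir)
writeLines("add_one <- function(x) x + 1", file.path(fun_dir, "add_one.R"))
writeLines("double_it <- function(x) x * 2", file.path(fun_dir, "double_it.R"))

# List only files ending in .R (or .r) and source each in turn.
r_files <- list.files(fun_dir, pattern = "\\.[Rr]$", full.names = TRUE)
invisible(lapply(r_files, source))

# The functions defined in the sourced files are now available.
double_it(add_one(2))
```

The `pattern` argument keeps non-script files (data, notes) in the same folder from being passed to `source()`.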


Will M


I was not able to implement "map" function: Error: 'map' is not an exported object from 'namespace:tidyverse' – Vojtěch Kania Jan 24 '23 at 13:57

score 1 · Answer 5 · answered Aug 12 '14 at 22:09


Although I understand your need for modularity, why not simply create a single script for the run of interest? Sourcing multiple scripts complicates passing variables between scripts unless you write them to files (which wastes CPU cycles). You could even build a master script that reads the text of each script, writes it all into one combined script, and then runs that combined script.
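The last suggestion could look roughly like the sketch below. It is only one possible reading of the idea: the file names `part1.R`, `part2.R` and `combined.R` are hypothetical, and the parts are written to a temporary folder so the example is self-contained.

```r
# Two example scripts written to a temporary folder (hypothetical names).
work_dir <- tempfile("combine_")
dir.create(work_dir)
parts <- file.path(work_dir, c("part1.R", "part2.R"))
writeLines("a <- 10", parts[1])
writeLines("b <- a * 2", parts[2])

# Read each script as text, concatenate in order, and write one combined script.
combined_path <- file.path(work_dir, "combined.R")
combined <- unlist(lapply(parts, readLines))
writeLines(combined, combined_path)

# Run the generated script.
source(combined_path)
b
```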


LearnPKPD


Thanks, I appreciate your answer! One large script just seems clunky and sloppy, especially considering I'm going to be passing this project off after the summer to people with very little experience in R or programming. – Christian Aug 13 '14 at 02:28

I would say that in general it is better to write each piece of your code as a small, modular function and then export it. That way your central script can call the functions (rather than other scripts) and you avoid the issues of source(). Or define functions within the 10~15 scripts, source the scripts, and then call the functions. You were on the right track with the organizing R code link. – schifferl Jul 22 '16 at 19:17
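A small sketch of the second option in that comment (scripts that only define functions, sourced and then called from a control script) might look like this. The function and file names are made up for the example, and the files are generated in a temporary folder so the snippet runs as is.

```r
# Write two hypothetical function-only scripts to a temporary folder.
proj_dir <- tempfile("project_")
dir.create(proj_dir)
writeLines("collect_data <- function(n) seq_len(n)",
           file.path(proj_dir, "collect.R"))
writeLines("analyze_data <- function(x) list(mean = mean(x), n = length(x))",
           file.path(proj_dir, "analyze.R"))

# Control script: load the definitions, then pass data explicitly between steps.
source(file.path(proj_dir, "collect.R"))
source(file.path(proj_dir, "analyze.R"))

raw <- collect_data(4)
summary_stats <- analyze_data(raw)
summary_stats$mean
```

Sourcing these files only defines functions, so it has no side effects on the data. Each step's inputs and outputs are visible in the control script.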
