106

I often have a main R Markdown file or knitr LaTeX file where I source some other R file (e.g., for data processing). However, I was thinking that in some instances it would be beneficial to have these sourced files be their own reproducible documents (e.g., an R Markdown file that not only includes commands for data processing but also produces a reproducible document that explains the data processing decisions).

Thus, I would like to have a command like source('myfile.rmd') in my main R Markdown file that would extract and source all the R code inside the R code chunks of myfile.rmd. Of course, calling source() directly on an .rmd file gives rise to an error.

The following command works:

```{r message=FALSE, results='hide'}
knit('myfile.rmd', tangle=TRUE)
source('myfile.R')
```

Here knitr writes the R code from myfile.rmd to myfile.R, which is then sourced; results='hide' can be omitted if the output is desired.

However, it doesn't seem perfect:

  • it results in the creation of an extra file;
  • it needs to appear in its own code chunk if control over the display is required;
  • it's not as elegant as a simple source(...).

Thus my question: Is there a more elegant way of sourcing the R code of an R Markdown file?

Hamada
Jeromy Anglim
  • I'm actually having a really hard time understanding your question (I read it several times). You can source other R scripts easily into a `Rmd` file. But you also want to source in other `markdown` files into a file being knitted? – Maiasaura Jun 10 '12 at 05:26
  • 4
    I want to source the R code inside R code chunks in R Markdown files (i.e., *.rmd). I've edited the question a little bit to try to make things clearer. – Jeromy Anglim Jun 10 '12 at 06:20
  • Something along the lines of `include` in latex. If markdown supports inclusion of other markdown documents, it should be relatively easy to create such a function. – Paul Hiemstra Jun 10 '12 at 07:36
  • @PaulHiemstra I guess that the ability to source the text and R code chunks would be useful also. I'm specifically thinking of sourcing just the code in an R Markdown document. – Jeromy Anglim Jun 10 '12 at 11:30
  • I wrote a function for sourcing specific chunks in an RMD you can find here: https://gist.github.com/brshallo/e963b9dca5e4e1ab12ec6348b135362e – Bryan Shalloway Feb 02 '22 at 23:19

14 Answers

43

It seems you are looking for a one-liner. How about putting this in your .Rprofile?

ksource <- function(x, ...) {
  library(knitr)
  source(purl(x, output = tempfile()), ...)
}

However, I do not understand why you want to source() the code in the Rmd file itself. I mean knit() will run all the code in this document, and if you extract the code and run it in a chunk, all the code will be run twice when you knit() this document (you run yourself inside yourself). The two tasks should be separate.

If you really want to run all the code, RStudio has made this fairly easy: Ctrl + Shift + R. It basically calls purl() and source() behind the scenes.
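A variation on the one-liner above, as a minimal sketch: it assumes knitr is installed, and the helper name ksource_env and its envir argument are hypothetical, not part of knitr. It tangles to a temporary file, deletes that file afterwards, and evaluates the code in an environment you choose, so nothing is left on disk and the global environment is only touched if you ask for it.

```r
# Hypothetical helper (not part of knitr): tangle an Rmd file to a temporary
# script, evaluate it in `envir`, and remove the temporary script afterwards.
ksource_env <- function(x, envir = globalenv(), ...) {
  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp), add = TRUE)
  knitr::purl(x, output = tmp, quiet = TRUE)
  # sys.source() evaluates the file in `envir` rather than the caller's frame
  sys.source(tmp, envir = envir, ...)
  invisible(envir)
}

# Usage: write a tiny Rmd file, then source its chunks into a fresh environment
rmd <- tempfile(fileext = ".Rmd")
writeLines(c("Some prose.", "", "```{r}", "x <- 1 + 1", "```"), rmd)
e <- ksource_env(rmd, envir = new.env())
get("x", envir = e)
```

Because this still relies on purl(), the reliability caveat raised in the comments on this answer applies to it as well.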

jomuller
Yihui Xie
  • 11
    Hi @Yihui I think it is helpful because sometimes your analysis might be organized in small scripts, but in your report you want to have the code for the whole pipeline. – lucacerone Jul 24 '14 at 09:03
  • 11
    So the use case here is that you want to write all of the code and have it be heavily documented and explained, but the code is run by some other script. – Brash Equilibrium Aug 19 '14 at 21:14
  • 4
    @BrashEquilibrium It is a matter of using `source()` or `knitr::knit()` to run the code. I know people are less familiar with the latter, but `purl()` is not reliable. You have been warned: https://github.com/yihui/knitr/pull/812#issuecomment-53088636 – Yihui Xie Aug 22 '14 at 19:43
  • 6
    @Yihui What would be the proposed alternative to 'source(purl(x,...))' in your view? How can one source multiple *.Rmd-Files, without running into an error regarding duplicate chunk labels? I'd rather not want to go back to the to-be-sourced document and knit it. I use *.Rmd for many files, that I potentially have to export and discuss with others, so it would be great to be able source multiple Rmd-Files for all steps of the analysis. – stats-hb Jan 09 '17 at 11:58
  • knitr emits error "Error: Required package is missing", when it renders the .rmd file. I have to execute code in the .rmd file to find the real error message containing the name of the missing package. One case is `caret` required `kernlab` with svm. – C.W. Jun 22 '20 at 00:17
  • NoamRoss has a gist for source_rmd() which does this but also has an option for ignoring plots: "https://gist.github.com/noamross/a549ee50e8a4fd68b8b1" – Bryan Shalloway Dec 02 '20 at 18:47
23

Factor the common code out into a separate R file, and then source that R file into each Rmd file you want it in.

So, for example, let's say I have two reports I need to make: Flu Outbreaks and Guns vs Butter Analysis. Naturally I'd create two Rmd documents and be done with it.

Now suppose the boss comes along and wants to see the variations of Flu Outbreaks versus Butter prices (controlling for 9mm ammo).

  • Copying and pasting the code to analyze the reports into the new report is a bad idea for code reuse, etc.
  • I want it to look nice.

My solution was to factor the project into these files:

  • Flu.Rmd
    • flu_data_import.R
  • Guns_N_Butter.Rmd
    • guns_data_import.R
    • butter_data_import.R

Within each Rmd file I'd have something like:

```{r include=FALSE}
source('flu_data_import.R')
```

The problem here is that we lose reproducibility. My solution to that is to create a common child document to include into each Rmd file. So at the end of every Rmd file I create, I add this:

```{r autodoc, child='autodoc.Rmd', eval=TRUE}
``` 

And, of course, autodoc.Rmd:

Source Data & Code
----------------------------
<div id="accordion-start"></div>

```{r sourcedata, echo=FALSE, results='asis', warning=FALSE}

if(!exists("autodoc.skip.df")) {  # exists() takes the object name as a string
  autodoc.skip.df <- list()
}

#Generate the following table:
for (i in ls(.GlobalEnv)) {
  if(!i %in% autodoc.skip.df) {
    itm <- tryCatch(get(i), error=function(e) NA )
    if(typeof(itm)=="list") {
      if(is.data.frame(itm)) {
        cat(sprintf("### %s\n", i))
        print(xtable(itm), type="html", include.rownames=FALSE, html.table.attributes=sprintf("class='exportable' id='%s'", i))
      }
    }
  }
}
```
### Source Code
```{r allsource, echo=FALSE, results='asis', warning=FALSE, cache=FALSE}
fns <- unique(c(compact(llply(.data=llply(.data=ls(all.names=TRUE), .fun=function(x) {a<-get(x); c(normalizePath(getSrcDirectory(a)),getSrcFilename(a))}), .fun=function(x) { if(length(x)>0) { x } } )), llply(names(sourced), function(x) c(normalizePath(dirname(x)), basename(x)))))

for (itm in fns) {
  cat(sprintf("#### %s\n", itm[2]))
  cat("\n```{r eval=FALSE}\n")
  cat(paste(tryCatch(readLines(file.path(itm[1], itm[2])), error=function(e) sprintf("Could not read source file named %s", file.path(itm[1], itm[2]))), sep="\n", collapse="\n"))
  cat("\n```\n")
}
```
<div id="accordion-stop"></div>
<script type="text/javascript">
```{r jqueryinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://code.jquery.com/jquery-1.9.1.min.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r tablesorterinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://tablesorter.com/__jquery.tablesorter.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r jqueryuiinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://code.jquery.com/ui/1.10.2/jquery-ui.min.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r table2csvinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(file.path(jspath, "table2csv.js")), sep="\n")
```
</script>
<script type="text/javascript">
  $(document).ready(function() {
  $('tr').has('th').wrap('<thead></thead>');
  $('table').each(function() { $('thead', this).prependTo(this); } );
  $('table').addClass('tablesorter');$('table').tablesorter();});
  //need to put this before the accordion stuff because the panels being hidden makes table2csv return null data
  $('table.exportable').each(function() {$(this).after('<a download="' + $(this).attr('id') + '.csv" href="data:application/csv;charset=utf-8,'+encodeURIComponent($(this).table2CSV({delivery:'value'}))+'">Download '+$(this).attr('id')+'</a>')});
  $('#accordion-start').nextUntil('#accordion-stop').wrapAll("<div id='accordion'></div>");
  $('#accordion > h3').each(function() { $(this).nextUntil('h3').wrapAll("<div>"); });
  $( '#accordion' ).accordion({ heightStyle: "content", collapsible: true, active: false });
</script>

N.B., this is designed for the Rmd -> HTML workflow. This will be an ugly mess if you go with LaTeX or anything else. This Rmd document looks through the global environment for all the source()'ed files and includes their source at the end of your document. It includes jQuery UI and tablesorter, and sets the document up to use an accordion style to show/hide sourced files. It's a work in progress, but feel free to adapt it to your own uses.

Not a one-liner, I know. Hope it gives you some ideas at least :)

Keith Twombley
12

Try the purl function from knitr:

source(knitr::purl("myfile.rmd", quiet=TRUE))

Kalana
Petr Hala
4

Perhaps one should start thinking differently. My suggestion is the following: write all the code you would normally have put in an .Rmd chunk into an .R file. Then, in the Rmd document you knit (e.g., to HTML), all that is left is

```{R Chunkname, Chunkoptions}  
source("file.R")
```

This way you'll probably create a bunch of .R files, and you lose the advantage of processing all the code "chunk after chunk" using Ctrl+Alt+N (or +C, but normally this does not work). However, I read the book on reproducible research by Mr. Gandrud and realized that he uses knitr and .Rmd files solely for creating HTML files. The main analysis itself is an .R file. I think .Rmd documents rapidly grow too large if you do your whole analysis inside them.

Sandy Muspratt
Pharcyde
3

If you are just after the code I think something along these lines should work:

  1. Read the markdown/R file with readLines
  2. Use grep to find the code chunks, searching for lines that start with ```{r (or with << in Sweave files), for example
  3. Take subset of the object that contains the original lines to get only the code
  4. Dump this to a temporary file using writeLines
  5. Source this file into your R session

Wrapping this in a function should give you what you need.
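The steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the function names extract_rmd_code and source_rmd_chunks are hypothetical, chunk options such as eval=FALSE are ignored, and inline `r ...` code is not handled. Instead of a temporary file it sources from a textConnection, so no file is created.

```r
# Hypothetical helper: return only the lines that sit inside ```{r ...} chunks.
extract_rmd_code <- function(lines) {
  is_open  <- grepl("^\\s*```\\s*\\{[rR]([ ,][^}]*)?\\}\\s*$", lines, perl = TRUE)
  is_close <- grepl("^\\s*```\\s*$", lines, perl = TRUE)
  in_chunk <- FALSE
  keep <- logical(length(lines))
  for (i in seq_along(lines)) {
    if (!in_chunk && is_open[i]) {
      in_chunk <- TRUE            # chunk header: start keeping the next lines
    } else if (in_chunk && is_close[i]) {
      in_chunk <- FALSE           # closing fence: stop keeping lines
    } else {
      keep[i] <- in_chunk         # body line: keep only while inside a chunk
    }
  }
  lines[keep]
}

# Hypothetical helper: evaluate the extracted code in `envir` without a file.
source_rmd_chunks <- function(path, envir = globalenv()) {
  code <- extract_rmd_code(readLines(path, warn = FALSE))
  con <- textConnection(code)
  on.exit(close(con), add = TRUE)
  source(con, local = envir)
  invisible(envir)
}

# Usage with a small throwaway document
rmd_lines <- c("# Title", "", "```{r setup, include=FALSE}", "a <- 2", "```",
               "Text.", "```{r}", "b <- a * 3", "```")
rmd <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd)
e <- source_rmd_chunks(rmd, envir = new.env())
```

Since it does not go through knitr at all, a sketch like this sidesteps chunk-label handling, at the cost of ignoring everything knitr would normally do with chunk options.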

Paul Hiemstra
  • 1
    Thank you, I guess that would work. However, the first four points sound like what Stangle already does in a reliable way for Sweave and what `knit('myfile.rmd', tangle=TRUE)` does in knitr. I guess I'm looking for a one liner that both tangles and sources and ideally creates no files. – Jeromy Anglim Jun 10 '12 at 12:20
  • Once you wrap it in a function it becomes a oneliner ;). What you could do is use `textConnection` to mimic a file, and source from that. This would avoid a file being created. – Paul Hiemstra Jun 10 '12 at 12:27
  • Yes. `textConnection` might be the place to look. – Jeromy Anglim Jun 10 '12 at 12:30
3

The following hack worked fine for me:

library(readr)
library(stringr)
source_rmd <- function(file_path) {
  stopifnot(is.character(file_path) && length(file_path) == 1)
  .tmpfile <- tempfile(fileext = ".R")
  .con <- file(.tmpfile) 
  on.exit(close(.con))
  full_rmd <- read_file(file_path)
  codes <- str_match_all(string = full_rmd, pattern = "```(?s)\\{r[^{}]*\\}\\s*\\n(.*?)```")
  stopifnot(length(codes) == 1 && ncol(codes[[1]]) == 2)
  codes <- paste(codes[[1]][, 2], collapse = "\n")
  writeLines(codes, .con)
  flush(.con)
  cat(sprintf("R code extracted to tempfile: %s\nSourcing tempfile...", .tmpfile))
  source(.tmpfile)
}
qed
  • For me this is the best answer, as it supports sourcing multiple R Markdown files without conflicts between unnamed-chunk labels, which happen if you use `knitr` directly. – mone27 Jul 01 '21 at 07:20
3

I use the following custom function

source_rmd <- function(rmd_file){
  knitr::knit(rmd_file, output = tempfile())
}

source_rmd("munge_script.Rmd")
Joe
1

I would recommend keeping the main analysis and calculation code in an .R file and importing the chunks as needed in the .Rmd file. I have explained the process here.

Community
pbahr
1

sys.source("./your_script_file_name.R", envir = knitr::knit_global())

Put this command before calling the functions defined in your_script_file_name.R.

The "./" before your_script_file_name.R indicates the path to your file relative to the working directory, which is the project root if you have already created a Project.

You can see this link for more detail: https://bookdown.org/yihui/rmarkdown-cookbook/source-script.html

Tranle
1

I use this one-liner:

```{r optional_chunklabel_for_yourfile_rmd, child = 'yourfile.Rmd'}
```

See: My .Rmd file becomes very lengthy. Is that possible split it and source() it's smaller portions from main .Rmd?

IVIM
1

I would say there is not a more elegant way of sourcing an R Markdown file, the ethos of Rmd being that the report is reproducible and, at best, self-contained. However, adding to the OP's original solution, the method below avoids the permanent creation of the intermediate file on disk. It also makes some extra effort to ensure chunk output does not appear in the rendered document:

knit_loc <- tempfile(fileext = ".R")
knitr::knit("myfile.rmd",
            output = knit_loc,
            quiet = TRUE,
            tangle = TRUE)
invisible(capture.output(source(knit_loc, verbose = FALSE)))

I would also add that if the child markdown's dependencies are external to your R environment (e.g., writing a file to disk, downloading some external resource, interacting with a web API), then I would opt for rmarkdown::render() instead of knit():

rmarkdown::render("myfile.rmd")
M_Merciless
0

this worked for me

source("myfile.r", echo = TRUE, keep.source = TRUE)
user63230
0

The answer by @qed is by far the best if you want to source the entire file. Kevin Keena built the function proposed by @Paul Hiemstra, and this can help you to convert your .Rmd into an .R file to then source the entire file (or parts of it) into another .R file, where knitr::purl would not be available.

terraviva
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/33917925) – Shawn Hemelstrand Feb 25 '23 at 16:34
0

I am surprised I have not seen this example online.

knitr::knit() has an option to choose which environment to knit a file in. If you choose the global environment, this functionally "sources" R Markdown documents if your focus is on an interactive session.

You can use the following code chunk in an R Markdown. "include=FALSE" can be changed to "include=TRUE" if you want to see all the file's output.

## Create Objects
```{r include=FALSE}
knitr::knit('create_objects.Rmd', envir = globalenv())
```

This may not be a solution for a document you want to publish -- I'm still testing what output looks like, but this is a solution for an R Notebook you use interactively.
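If writing into the global environment is not wanted, the same envir argument can point at a fresh environment instead. The following is a rough sketch for interactive use, assuming knitr is installed; the file contents and the names objs and answer are made up for illustration. The rendered Markdown goes to a temporary file and is simply ignored.

```r
# Sketch: knit into a fresh environment, discard the rendered output,
# and keep only the objects the document created.
objs <- new.env()

# A throwaway Rmd file standing in for something like create_objects.Rmd
rmd <- tempfile(fileext = ".Rmd")
writeLines(c("```{r}", "answer <- 6 * 7", "```"), rmd)

knitr::knit(rmd, output = tempfile(fileext = ".md"), envir = objs, quiet = TRUE)
ls(objs)
```

When a call like this is made from inside another document that is itself being knitted, duplicate chunk labels can get in the way, as a comment earlier in this thread points out, so this is best treated as an interactive convenience.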