When a document is knitted with the "Knit" button, it is rendered in a fresh R session, so objects in your global environment are not passed to the document. This is intentional: accidentally referencing an object in the global environment is an easy way to break a reproducible analysis. Starting from a clean session each time means the RMarkdown file runs on its own, regardless of what is in the global environment.
If you do have a use case which justifies preloading the data, there are a few things you can do.
Example Data
Firstly I have created a minimal Rmd file as below called "RenderTest.Rmd":
---
title: "Render"
author: "Michael Harper"
date: "7 November 2017"
output: pdf_document
---
```{r cars}
summary(cars2)
```
In this example, cars2 is a dataset that I am referencing from my global session. Run on its own using the "Knit" command in RStudio, this will return the following error:
Error in summary(cars2): object 'cars2' not found: ... withCallingHandlers -> withVisible -> eval -> eval -> summary
Execution halted
Option 1: Manually Call the render function
The render function from rmarkdown can be called from another R script. By default, this does not create a fresh environment for the document to run in (the code is evaluated in the calling environment), so you can use any objects already loaded. As an example:
# Build file
library(rmarkdown)
cars2 <- cars             # object the document expects to find
render("RenderTest.Rmd")  # evaluated in the calling environment by default
I would, however, be careful doing this. The benefit of using RMarkdown is that it makes reproducing the analysis incredibly easy. As soon as you start using external scripts, things become more complicated to replicate, as not all of the settings are contained within the file.
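If you do go down this route, one way to keep track of what the document depends on is to pass the objects explicitly through the envir argument of render. The following is only a sketch under a few assumptions: render_env is a name I have made up for illustration, and "RenderTest.Rmd" is assumed to sit in the working directory. Note that it does not isolate the document from the global environment (the parent is still the global environment); it just makes the hand-over explicit.

```r
# Sketch only: hand the document an explicit environment.
# 'render_env' is a hypothetical name chosen for this example.

# The parent is the global environment, so attached packages and
# functions such as summary() are still found during rendering.
render_env <- new.env(parent = globalenv())

# Put the object the document expects ('cars2') into that environment.
assign("cars2", cars, envir = render_env)

# Only render if rmarkdown is installed and the file is actually there.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    file.exists("RenderTest.Rmd")) {
  rmarkdown::render("RenderTest.Rmd", envir = render_env)
}
```

Listing the objects in one place like this at least documents what the Rmd file needs from outside, which helps a little with the reproducibility concern above.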
Option 2: Save data to an R object
If you have some analysis which takes time to run, you can save the result of the analysis as an R object, and then you can reload the final version of the data into the session. Using my above example:
```{r dataProcess, cache = TRUE}
cars2 <- cars
save(cars2, file = "carsData.RData") # saves the 'cars2' dataset
```
and then we can just reload the data into the session:
```{r}
load("carsData.RData") # reloads the 'cars2' dataset
```
I prefer this technique. The chunk dataProcess is cached, so it is only run if there are changes made to the code. The results are saved to a file, which is then loaded by the next chunk. The data still has to be loaded into the session, but by saving the finalised dataset you avoid repeating any data cleaning.
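A variation on the same idea, which does not rely on knitr's cache, is to check whether the saved file already exists and only run the slow step when it does not. This is a sketch rather than what I used above: it swaps save/load for saveRDS/readRDS, and the file name and location (data_file, in a temporary directory) are assumptions made so the example is self-contained.

```r
# Sketch only: recompute the data when no saved copy exists yet.
# 'data_file' is a hypothetical path; in practice you would probably
# keep the file next to the Rmd document instead of in tempdir().
data_file <- file.path(tempdir(), "carsData.rds")

if (!file.exists(data_file)) {
  cars2 <- cars                      # stand-in for a slow processing step
  saveRDS(cars2, file = data_file)   # store the finished object on disk
}

# Always read the saved version, so later chunks use the same data
# whether or not the processing step ran this time.
cars2 <- readRDS(data_file)
```

The trade-off is that you have to delete the file yourself when the processing code changes, whereas a cached chunk is rerun automatically when its code is edited.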
Option 3: Build the file less frequently
With the updates made to RStudio over the past few years, there is less of a need to continuously rebuild the file. Chunks can be run directly within the file and their output viewed straight away. This can spare you a lot of time spent trying to optimise the script, only to save a couple of minutes on compiling (which normally makes a good time to get a hot drink anyway!).
