
I'm working on an R Markdown document that uses objects that take a long time to create and transform. The document looks similar to this:

---
title: "Example"
author: "Test"
date: "October 29, 2015"
output: pdf_document
---

Example

```{r}
test_exc <- "NO"
if(exists("some_dta") == FALSE) {
  set.seed(1)
  # This data is big and messy to transform and I don't want to do it twice
  some_dta <- data.frame(speed=runif(n = 1000),nonsense=runif(1000))
  test_exc <- "YES"
}
```

You can also embed plots, for example:

```{r, echo=FALSE}
plot(some_dta)
```

Was the code executed: `r test_exc`

As the code above suggests, I would like to avoid re-running the block inside if(exists("some_dta") == FALSE) { ... } when the object already exists. As the output below shows, the code inside the if block still executes when the document is knitted:

Markdown results

Is there a way to make RStudio's document-knitting mechanism recognise that those objects already exist in my session, so that there is no need to create them again?

Konrad

2 Answers


You might want to use caching, as described in the online knitr documentation, e.g.:

---
title: "Example"
author: "Test"
date: "October 29, 2015"
output: pdf_document
---

Example

```{r chunk1,cache=TRUE}
set.seed(1)
# This data is big and messy to transform and I don't want to do it twice
some_dta <- data.frame(speed=runif(n = 1000),nonsense=runif(1000))
```
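If most chunks in the document are expensive, caching can also be switched on once instead of per chunk. The following is a minimal sketch of what a setup chunk near the top of the document might contain; it assumes the knitr package is installed, and the `cache/` directory name is only an illustrative choice.

```r
# Sketch of a setup chunk body; assumes the knitr package is installed.
library(knitr)

opts_chunk$set(
  cache = TRUE,           # reuse stored results for every chunk by default
  cache.path = "cache/"   # hypothetical directory for the cache files
)
```

A cached chunk is re-run when its code or its chunk options change, and an individual chunk can still opt out with `cache=FALSE` in its header.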
Ben Bolker

You could save your data to an .rds file and then check whether that file exists before recreating the object:

```{r}
if(!file.exists("some_dta.rds")) {
  set.seed(1)
  # This data is big and messy to transform and I don't want to do it twice
  some_dta <- data.frame(speed=runif(n = 1000),nonsense=runif(1000))
  saveRDS(some_dta, file='some_dta.rds')
} else {
  some_dta <- readRDS('some_dta.rds')
}
```
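If several objects need the same treatment, the check can be wrapped in a small function. This is only a sketch: `cache_rds` is a hypothetical helper name (not part of any package), and the `tempdir()` location is used purely so the example is self-contained.

```r
# Hypothetical helper: read the object from `path` if the file exists,
# otherwise call `build()`, save the result to `path`, and return it.
cache_rds <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  obj <- build()
  saveRDS(obj, file = path)
  obj
}

# Demo location only: tempdir() does not persist between R sessions,
# so a real document would point at a persistent local path instead.
cache_file <- file.path(tempdir(), "some_dta.rds")

some_dta <- cache_rds(cache_file, function() {
  set.seed(1)
  # This data is big and messy to transform and I don't want to do it twice
  data.frame(speed = runif(n = 1000), nonsense = runif(1000))
})
```

The second argument is a function rather than a value, so the expensive code is only evaluated when the file is missing.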
DunderChief
  • Thanks for your interest. I use `.rds` files a lot. Ideally, I would rather not create many of them; this file also links to network shares, so pushing massive files across the network does not help the speed. – Konrad Oct 29 '15 at 13:48
    @Konrad So you want to pull objects currently stored in memory from your console session? [See this question](http://stackoverflow.com/questions/11155182/is-there-a-way-to-knitr-markdown-straight-out-of-your-workspace-using-rstudio). – DunderChief Oct 29 '15 at 14:23
  • Reading objects already loaded into the global environment would be the ideal solution. I could then modify them easily and avoid re-creating them while compiling the document. The cache option will work, but being able to load from the global environment would be ideal. – Konrad Nov 01 '15 at 20:45
    @Konrad The post I linked says you can knit manually instead of using RStudio's Knit button, but I have never tried this. To me, this approach of pulling in variables from your current R session defeats the purpose of creating a reproducible document. But maybe it's appropriate for your specific task. – DunderChief Nov 02 '15 at 22:51
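
The manual approach mentioned in the comments can be sketched as follows, with the caveat about reproducibility raised above. It assumes the rmarkdown package is installed, that the code is run from the console session that already holds the objects, and that `example.Rmd` is a placeholder for the real file name. RStudio's Knit button renders in a separate R session, which is why the `exists("some_dta")` check in the question fails there.

```r
# Objects created interactively in the console live in the global environment.
set.seed(1)
some_dta <- data.frame(speed = runif(n = 1000), nonsense = runif(1000))

# Hypothetical file name; replace with the path to your own document.
rmd_file <- "example.Rmd"

# render() evaluates the chunks in `envir`, so with the global environment
# the document sees `some_dta` and the exists() check skips the slow block.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  rmarkdown::render(rmd_file, envir = globalenv())
}
```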