
I have the following Rmd file I called test.Rmd:

---
title: "test"
output: html_document
---

```{r}
print(y)
```

```{r}
x <- "don't you ignore me!"
print(x)
```

I want to call render the following way:

render('test.Rmd', output_format = "html_document",
        output_file = 'test.html',
        envir = list(y="hello"))

but it fails:

processing file: test.Rmd
  |................                                                 |  25%
  ordinary text without R code

  |................................                                 |  50%
label: unnamed-chunk-1
  |.................................................                |  75%
  ordinary text without R code

  |.................................................................| 100%
label: unnamed-chunk-2
Quitting from lines 11-13 (test.Rmd) 
Error in print(x) : object 'x' not found

The first chunk went just fine, so something has worked. If I define y in my global environment I can run it without the envir argument and it works fine.

I figured maybe render doesn't like lists, so let's give it a proper environment:

y_env <- as.environment(list(y="hello"))
ls(envir = y_env)
# [1] "y"

render('test.Rmd', output_format = "html_document",
       output_file = 'test.html',
       envir = y_env)

But that's even worse: now it can't find print!

processing file: test.Rmd
  |................                                                 |  25%
  ordinary text without R code

  |................................                                 |  50%
label: unnamed-chunk-1
Quitting from lines 7-8 (test.Rmd) 
Error in eval(expr, envir, enclos) : could not find function "print"

The docs mention using the function new.env, so out of desperation I tried this:

y_env <- new.env()
y_env$y <- "hello"
render('test.Rmd', output_format = "html_document",
       output_file = 'test.html',
       envir = y_env)

And now it works!

processing file: test.Rmd
  |................                                                 |  25%
  ordinary text without R code

  |................................                                 |  50%
label: unnamed-chunk-1
  |.................................................                |  75%
  ordinary text without R code

  |.................................................................| 100%
label: unnamed-chunk-2

output file: test.knit.md

"C:/Program Files/RStudio/bin/pandoc/pandoc" +RTS -K512m -RTS test.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template "**redacted**\RMARKD~1\rmd\h\DEFAUL~1.HTM" --no-highlight --variable highlightjs=1 --variable "theme:bootstrap" --include-in-header "**redacted**\AppData\Local\Temp\RtmpGm9aXz\rmarkdown-str3f6c5101cb3.html" --mathjax --variable "mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" 

Output created: test.html

So I'm confused about several things. To recap:

  • Why does render recognize lists (the first chunk didn't fail) but then ignore regular assignments in the chunks?
  • Why doesn't my second try work, and how is it different from my third try?
  • Is this a bug?
  • What's the idiomatic way to do this?
moodymudskipper
  • I think I got most of it: `as.environment` by default creates an environment under the empty environment, so nothing is available there, not even `print`. `new.env`, on the other hand, creates an environment under `parent.frame()`, where `print` is visible, so that part is not a bug. I still don't understand the first case, but it's undocumented, so it's not really a bug either (though making it work would be a handy feature). – moodymudskipper Oct 09 '18 at 16:33
  • Just FYI, I've figured this out, and will revise my (currently deleted) answer accordingly once I have a few free minutes. – Josh O'Brien Oct 10 '18 at 16:35
  • thank you, also FYI : https://github.com/rstudio/rmarkdown/issues/1462 – moodymudskipper Oct 10 '18 at 17:12
  • The reason it accepts lists is because all the accessor functions are the same for lists and environment. E.g., you can do `get("y", envir)` with either a list or a real environment. – thc Oct 10 '18 at 22:30
  • OK, I finally got around to finishing the corrected version of my answer. (The problem with the previous version is that it described the evaluation pathway for options passed to **knitr** (which uses `eval()`), which turns out not to be the same as the evaluation pathway used for code chunks (which uses `evaluate()`).) – Josh O'Brien Oct 15 '18 at 00:32
  • I'm genuinely humbled that you gave all this attention to my largely ignored question. Top notch answer, you're a true SO hero! – moodymudskipper Oct 15 '18 at 07:11
  • Well, it was a very good question. (On another topic, I have to say I really appreciate your offer to work for free for charities in need. I'll have to put a similar offer on my user page as well, once I get around to filling it out!) – Josh O'Brien Oct 15 '18 at 15:15

1 Answer

Your first two examples fail for different reasons. To understand both failures, it helps to know a bit about how code chunks are evaluated by knitr and rmarkdown.


knitr's general code chunk evaluation procedure

When you call rmarkdown::render() on your file, each code chunk is ultimately evaluated by a call to evaluate::evaluate(). In terms of its evaluation behavior and scoping rules, evaluate() behaves almost exactly like the base R function eval().

(Where evaluate::evaluate() differs most from eval() is in how it handles the output of each evaluated expression. As explained in ?evaluate, in addition to evaluating the expression passed as its first argument, it "captures all of the information necessary to recreate the output as if you had copied and pasted the code into an R terminal". That info includes plots and warning and error messages, which is why it's so handy in a package like knitr!)

In any case, the eventual call to evaluate(), from within the function knitr:::block_exec(), looks something like this:

evaluate::evaluate(code, envir = env, ...)

in which:

  • code is a vector of character strings giving the (possibly multiple) expressions making up the current chunk.

  • env is the value that you supplied to the envir formal argument in your original call to rmarkdown::render().
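
To make that concrete, here is a minimal sketch of the same idea using only base R. It is not knitr's actual implementation: it uses parse() and eval() as a stand-in for evaluate::evaluate(), relying on the point above that the two share the same scoping behavior, and it assumes env is an ordinary environment created with new.env().

```r
# Simplified stand-in for chunk evaluation, using base eval() instead of
# evaluate::evaluate(). The names `code` and `env` mirror the call above.
code <- c('x <- "don\'t you ignore me!"',
          'print(x)')
env <- new.env()  # a real environment whose enclosure can see base functions

# Parse the chunk into expressions and evaluate them one at a time in `env`.
for (expr in parse(text = code)) {
  eval(expr, envir = env)
}
## [1] "don't you ignore me!"

# The assignment persisted in `env`, which is why print(x) could find it.
exists("x", envir = env, inherits = FALSE)
## [1] TRUE
```

Because every expression is evaluated in the same environment object, whatever one expression assigns is still there for the next one. That is the property the two failing examples lose, each in its own way.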


Your first example

In your first example, envir is a list, not an environment. When that is the case, evaluation is carried out in a local environment created by the function call. Unresolved symbols (as documented in both ?eval and ?evaluate) are looked for first in the list passed as envir and then in the chain of environments beginning with the one given by the enclos argument. Assignments, crucially, are local to the temporary evaluation environment, which goes out of existence once the function call is complete.

Because evaluate() operates, one at a time, on a character vector of expressions, when envir is a list, variables created in one of those expressions won't be available for use in the subsequent expressions.

When the envir argument to rmarkdown::render() is a list, your code block ultimately gets evaluated by a call like this:

library(evaluate)
code <- c('x <- "don\'t you ignore me!"',
          'print(x)')
env <- list(y = 1:10)
evaluate(code, envir = env)

## Or, for prettier printing:
replay(evaluate(code, envir = env))
## > x <- "don't you ignore me!"
## > print(x)
## Error in print(x): object 'x' not found

The effect is exactly the same as if you did this with eval():

env <- list(y = 1:10)
eval(quote(x <- "don't you ignore me"), envir = env)
eval(quote(x), envir = env)
## Error in eval(quote(x), envir = env) : object 'x' not found

Your second example

When envir= is an environment returned by as.environment(list()), you get errors for a different reason. In that case, your code block ultimately gets evaluated by a call like this:

library(evaluate)
code <- c('x <- "don\'t you ignore me!"',
          'print(x)')
env <- as.environment(list(y = 1:10))
evaluate(code, envir = env)

## Or, for prettier printing:
replay(evaluate(code, envir = env))
## > x <- "don't you ignore me!"
## Error in x <- "don't you ignore me!": could not find function "<-"
## > print(x)
## Error in print(x): could not find function "print"

As you've noted, this fails because as.environment() returns an environment whose enclosing environment is the empty environment (i.e. the environment returned by emptyenv()). evaluate() (like eval() would) looks for the symbol <- in env and, when it doesn't find it there, starts up the chain of enclosing environments which, here, don't contain any match. (Recall also that when envir is an environment, rather than a list, the enclos argument is not used.)
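
The difference between the second and third attempts can be checked directly from the console. The following is a small sketch in base R; the variable names env_from_list and env_new are only illustrative.

```r
# Two ways to build an environment holding y, with different enclosures.
env_from_list <- as.environment(list(y = "hello"))
env_new <- new.env()
assign("y", "hello", envir = env_new)

# as.environment() on a list gives an environment enclosed by emptyenv().
identical(parent.env(env_from_list), emptyenv())
## [1] TRUE

# So print() is invisible from the first environment but visible from the second.
exists("print", envir = env_from_list)
## [1] FALSE
exists("print", envir = env_new)
## [1] TRUE
```

Both environments contain y. Only the second has a chain of enclosing environments that eventually reaches the base package, where print() and `<-` live.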


Recommended solution

To do what you want, you'll need to create an environment that (1) contains all of the objects in your list and (2) has as its enclosing environment the parent environment of your call to render() (i.e. the environment in which a call to render() is normally evaluated). The most succinct way to do that is to use the nifty list2env() function, like so:

env <- list2env(list(y = "hello"), parent = parent.frame())
render('test.Rmd', output_format = "html_document",
        output_file = 'test.html',
        envir = env)

Doing so will result in your code chunks being evaluated by code like the following, which is what you want:

library(evaluate)
code <- c('x <- "don\'t you ignore me!"',
          'print(x)')
env <- list2env(list(y = 1:10), parent = parent.frame())
evaluate(code, envir = env)
replay(evaluate(code, envir = env))
## > x <- "don't you ignore me!"
## > print(x)
## [1] "don't you ignore me!"
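
To check that an environment with these two properties behaves as intended without running render(), here is a small sketch that uses base eval() as a stand-in for evaluate(). It sets the enclosure through list2env()'s parent argument and passes globalenv() explicitly, which assumes render() would be called from the top level of an interactive session.

```r
# Build the environment explicitly; globalenv() stands in for the frame
# that calls render() in an interactive session (an assumption).
env <- list2env(list(y = "hello"), parent = globalenv())

# Stand-in for two consecutive chunk expressions.
eval(quote(x <- paste(y, "world")), envir = env)
eval(quote(print(x)), envir = env)
## [1] "hello world"

# The assignment landed in `env` itself, so later expressions can see it.
exists("x", envir = env, inherits = FALSE)
## [1] TRUE
```

Here y is found in env, paste() and print() are found by walking up from the enclosing environment, and x is stored in env rather than being discarded after each expression.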
Josh O'Brien
  • Thank you for the neat answer, Josh. It is still not clear to me how `x` was not found though, as it's defined in the line just before the error is triggered, so in the local environment. Could you build a variation of your code block that would return that the `x` object is not found? Now, to be pragmatic, I think what I need in order to knit in a clean environment, while still accessing base functions, functions from attached packages, and `y`, is to call render with `envir= list2env(list(y="hello"), parent.frame())`. Do you agree? – moodymudskipper Oct 09 '18 at 22:38
  • Great answer, Josh! @Moody_Mudskipper see [this answer by Yihui](https://stackoverflow.com/a/48494678/2706569) (in particular point 2) about knitting in a clean environment. – CL. Oct 15 '18 at 10:41
  • Thanks CL. @Josh in order to make our environment "even cleaner" I think it could also be nice to position it just before `"package:stats"` on the search path so the `render` call becomes more reproducible and package functions need to be called with `pkg::fun` in the rmd file. Would you know how to achieve this ? I think it's in scope (if it's a short fix) but I could also post a new question. – moodymudskipper Oct 15 '18 at 16:00
  • I think I know how to do it: we can detach all the package environments before `"package:stats"`, render with the strategy described in this answer, then reattach them. The namespaces are not unloaded, so reattaching is fast, and the neat thing with this strategy is that the rmd can include library calls, so we don't have to use only `pkg::fun` notation. – moodymudskipper Oct 15 '18 at 19:59
  • @Moody_Mudskipper -- I agree this might be better as a new question. As you probably know, detaching packages doesn't really completely remove their footprints from a session (see e.g. the question, answer, and comments [here](https://stackoverflow.com/a/11005886/980833)), so I tend not to like that as a solution. If you want a clean session, it's generally better to launch a new R process. Yihui discusses that approach in the answer linked by CL above. It's also a strategy liberally employed by the **devtools** package and used by a number of functions in the base R **tools** package. – Josh O'Brien Oct 15 '18 at 20:50
  • Your link is fascinating. Namespaces are hard to tame; I tried and failed. However, it seems that things mostly go wrong because of registered methods in very specific cases. Balancing this against the cost of knitting in a new session (which can be super slow on a network drive), I decided to go for it anyway and did this, ignoring namespace issues, which seems to work: https://gist.github.com/moodymudskipper/ec838d6b87ad823c9089dc1ed9d601a6 . I won't ask a new question about it now. I'm just leaving it here for info. Anyone's welcome to comment there. – moodymudskipper Oct 16 '18 at 01:12