What's the difference between substitute and quote in R

Question

In the official docs, it says:

substitute returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env.

quote simply returns its argument. The argument is not evaluated and can be any R expression.

But when I try:

> x <- 1
> substitute(x)
x
> quote(x)
x

It looks like both quote and substitute return the expression that's passed to them as an argument.

So my question is: what's the difference between substitute and quote, and what does "substituting any variables bound in env" mean?

This is a very good resource, I doubt I can explain it better than Hadley - http://adv-r.had.co.nz/Computing-on-the-language.html — Rafael Zayas, Oct 19 '17 at 16:45

score 58 · Accepted Answer · Josh O'Brien · 2020-06-20T22:12:43.293

Here's an example that may help you to easily see the difference between quote() and substitute(), in one of the settings (processing function arguments) where substitute() is most commonly used:

f <- function(argX) {
   list(quote(argX), 
        substitute(argX), 
        argX)
}
    
suppliedArgX <- 100
f(argX = suppliedArgX)
# [[1]]
# argX
# 
# [[2]]
# suppliedArgX
# 
# [[3]]
# [1] 100
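The same thing can be seen when the caller passes a whole expression rather than a single name. This is only a minimal sketch; `show_arg` is a made-up helper name, not something from base R or from the answer above.

```r
# Hypothetical helper for illustration: returns the code the caller typed
# for `arg`, what quote() sees, and the evaluated value.
show_arg <- function(arg) {
  code <- substitute(arg)              # the unevaluated expression supplied by the caller
  list(code   = deparse(code),         # that expression as a string
       quoted = deparse(quote(arg)),   # quote() only ever sees the literal name `arg`
       value  = arg)                   # forcing the argument evaluates the expression
}

a <- 2
b <- 3
res <- show_arg(a + b)
res$code    # "a + b"
res$quoted  # "arg"
res$value   # 5
```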

edited Jun 20 '20 at 22:12

answered Oct 19 '17 at 17:11

Josh O'Brien


`substitute()` substitutes the symbol with the expression that was given as an argument. – its.me.adam Feb 28 '22 at 03:32

score 24 · Answer 2 · answered Oct 19 '17 at 16:57

R has lazy evaluation, so the identity of a variable name token is a little less clear than in other languages. This is used in libraries like dplyr where you can write, for instance:

summarise(mtcars, total_cyl = sum(cyl))

We can ask what each of these tokens means: summarise and sum are defined functions, mtcars is a defined data frame, total_cyl is a keyword argument for the function summarise. But what is cyl?

> cyl
Error: object 'cyl' not found

It isn't anything! Well, not yet. R doesn't evaluate it right away, but treats it as an expression to be evaluated later in an environment that is different from the global environment your command line is working in, specifically one where the columns of mtcars are defined. Conceptually, somewhere in the guts of dplyr, something like this is happening:

> substitute(cyl, mtcars)
[1] 6 6 4 6 8 ...

Suddenly cyl means something. That's what substitute is for.
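For what it's worth, here is a rough base-R sketch of the general pattern: capture the caller's expression with substitute, then evaluate it with the data frame's columns in scope. `my_summarise` is a made-up name, and this is an assumption-laden illustration of the idea, not how dplyr is actually implemented (dplyr has its own tidy-evaluation machinery).

```r
# Hypothetical function for illustration only; not dplyr's implementation.
my_summarise <- function(data, expr) {
  code <- substitute(expr)           # capture the caller's code, e.g. sum(cyl)
  eval(code, data, parent.frame())   # evaluate it with data's columns visible,
                                     # falling back to the caller's environment
}

total <- my_summarise(mtcars, sum(cyl))
identical(total, sum(mtcars$cyl))    # TRUE
```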

So what is quote for? Well, sometimes you want your lazily-evaluated expression to be represented somewhere else before it's evaluated, i.e. you want to display the actual code you're writing without any (or only some) values substituted. The help page you quoted from notes that a typical use is to create "informative labels for data sets and plots".

So, for example, you could create a quoted expression, then both print the unevaluated expression in your chart to show how the value was calculated and actually calculate with it.

expr <- quote(x + y)
print(expr) # x + y
eval(expr, list(x = 1, y = 2)) # 3

Note that substitute can do this expression trick too, while giving you the option to substitute values into only part of it. So its features are a superset of quote's.

expr <- substitute(x + y, list(x = 1))
print(expr) # 1 + y
eval(expr, list(y = 2)) # 3
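If you want to convince yourself of the "superset" claim, a small check along these lines should do it. It is just a sketch; the variable names `same` and `partial` are mine.

```r
# With an empty list there is nothing to substitute, so the result
# is the same call that quote() returns.
same <- identical(substitute(x + y, list()), quote(x + y))
same                       # TRUE

# Supplying a binding for x replaces only x; y stays a symbol.
partial <- substitute(x + y, list(x = 1))
is.numeric(partial[[2]])   # TRUE, the first operand is now the value 1
is.symbol(partial[[3]])    # TRUE, the second operand is still the name y
```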

Thanks, that's the only answer I could understand! Was it so hard to include an example like yours in the help or online documentation? =( — Leopoldo Sanczyk, Dec 15 '18 at 21:13
@LeopoldoSanczyk R documentation is notorious for being absolute trash — Stefano Borini, Jun 22 '23 at 10:41

score 15 · Answer 3 · answered Oct 19 '17 at 16:46

Maybe this section of the documentation will help somewhat:

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.

Note the final bit, and consider this example:

> e <- new.env()
> assign(x = "a", value = 1, envir = e)
> substitute(a, env = e)
[1] 1

Compare that with:

> quote(a)
a

So there are two basic situations when the substitution will occur: when we're using it on an argument of a function, and when env is some environment other than .GlobalEnv. So that's why your particular example was confusing.
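To make the rules from the documentation concrete, here is a small sketch covering an ordinary variable, an unbound symbol, and a promise, all in an environment other than .GlobalEnv. The names `e`, `a`, `p`, `u`, `v` and the result variables are just placeholders I picked.

```r
e <- new.env()
assign("a", 1, envir = e)                   # ordinary variable bound in e
delayedAssign("p", u + v, assign.env = e)   # promise bound in e (never forced here)

r_value   <- substitute(a + b, e)   # a is replaced by its value; b is not bound in e
r_promise <- substitute(p, e)       # the promise's expression replaces the symbol

r_value     # 1 + b
r_promise   # u + v
```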

For another comparison with quote, consider modifying the myplot function in the examples section to be:

myplot <- function(x, y)
    plot(x, y, xlab = deparse(quote(x)),
             ylab = deparse(quote(y)))

and you'll see that quote really doesn't do any substitution.
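If you'd rather not open a plot window, the same comparison can be made by returning the labels directly. A minimal sketch, with `labels_quote` and `labels_subst` as made-up function names:

```r
# Hypothetical helpers: build axis labels the two different ways.
labels_quote <- function(x, y) c(deparse(quote(x)), deparse(quote(y)))
labels_subst <- function(x, y) c(deparse(substitute(x)), deparse(substitute(y)))

height <- c(150, 160, 170)
weight <- c(50, 60, 70)

lq <- labels_quote(height, weight)
ls <- labels_subst(height, weight)
lq   # "x" "y"
ls   # "height" "weight"
```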

Hmm, that's interesting, thank you for the answer. Just out of curiosity, why is .GlobalEnv treated as an exception for substitute? Thanks! — Lifu Huang, Oct 19 '17 at 17:46

score 6 · Answer 4 · answered Dec 27 '17 at 21:17

Regarding your question about why .GlobalEnv is treated as an exception for substitute: it is simply a legacy of S. From the R Language Definition (https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Substitutions):

The special exception for substituting at the top level is admittedly peculiar. It has been inherited from S and the rationale is most likely that there is no control over which variables might be bound at that level so that it would be better to just make substitute act as quote.
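A quick way to see the exception in action is to pass the environment explicitly. This is only a sketch; as far as I can tell, passing globalenv() as env behaves the same as calling substitute at the prompt, and `e`, `at_top`, `in_other` are names I made up.

```r
x <- 1
e <- new.env()
assign("x", 1, envir = e)

at_top   <- substitute(x + y, globalenv())  # global environment: nothing is substituted
in_other <- substitute(x + y, e)            # any other environment: x is replaced by 1

identical(at_top, quote(x + y))     # TRUE, substitute acted just like quote
identical(in_other, quote(1 + y))   # TRUE
```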
