What is the reason not to use string quotes for syntactic names?

Question

I have used double quotes for a syntactic name or for the name of a function and got a comment to use backtick quotes instead. Here I got a comment that it is totally fine to pass a function name as a character string to match.fun (and thus *apply functions or do.call).

A <- matrix(1:4, 2)
B <- matrix(4:1, 2)
apply(A, 2, `*`, B)  # Works: backtick quotes
apply(A, 2, "*", B)  # Works: double quotes
apply(A, 2, '*', B)  # Works: single quotes
# apply(A, 2, *, B)  # Error: unexpected '*' in "apply(A, 2, *"

`%x%` <- function(lhs, rhs) lhs * rhs  # Works: backtick quotes
"%x%" <- function(lhs, rhs) lhs * rhs  # Works: double quotes
'%x%' <- function(lhs, rhs) lhs * rhs  # Works: single quotes
# %x% <- function(lhs, rhs) lhs * rhs  # Error: unexpected SPECIAL in "%x%"

I would like to know what the disadvantages are of using single ' or double " quotes for function names instead of backticks `. In which cases should each quote type be used?

This seems to be more of a convention than a hard requirement, but IMO a very pervasive convention. You have two separate contexts in the example: using a function as an argument and defining a function/assignment. For passing an argument, you should read the function documentation (e.g. `apply` says you need to quote or backquote). Otherwise, for me, it's the syntax highlighting. If you use backquotes, all IDEs/syntax highlighting packages I know will keep the name the same as other code, making it clear it is an object, not a string. — Marcus, May 26 '23 at 16:18
I agree @Marcus. I think the biggest disadvantage is confusing coders that are unaware you can assign like that (with quotes and not ticks) — SmokeyShakers, May 26 '23 at 16:24
The two examples are not the same. It is totally fine to pass a function name as a character string to `match.fun` (and thus `*apply` functions or `do.call`). Just try to be consistent. Having a character string on the LHS of `<-` OTOH is just weird style. However, it can be useful if you have a keyboard where backticks are not very accessible. — Roland, May 26 '23 at 19:01

nicola · Accepted Answer · 2023-06-07T07:36:15.460

It has to be stressed that

`*`

and

"*"

by themselves are two different objects: the former is a function, while the latter is just a character vector. Whether using one or the other makes a difference depends on the use case. In most cases, when you pass a function to another function, match.fun is invoked (this happens with *apply, do.call and basically any base function that accepts functions as arguments), so passing either object makes no difference. However, if you use a function from an external package or another source, you cannot be sure that match.fun is called. For example, say that you have this function:

ex_fun<-function(a, b, FUN) {
    return(FUN(a, b))
}

This works:

ex_fun(1, 3, `*`)
#[1] 3

This does not:

ex_fun(1, 3, "*")
#Error in FUN(a, b) : could not find function "FUN"
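
As a minimal sketch of the difference, here is a variant of the function above that resolves its argument with match.fun first, the way the *apply functions do. The name ex_fun2 is made up for this illustration and is not part of the original example:

```r
# Hypothetical variant of ex_fun: resolve FUN before calling it
ex_fun2 <- function(a, b, FUN) {
  FUN <- match.fun(FUN)  # accepts a function or a length-one character string
  FUN(a, b)
}

ex_fun2(1, 3, `*`)  # the function object is passed through unchanged
ex_fun2(1, 3, "*")  # the string is looked up as a function name
```

Both calls now return 3, because match.fun turns the string into the function before it is invoked.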

Another point: everything that happens in R is a function call, even assignment. So when you use something like:

var <- value

The parser turns the above instruction into:

`<-`(var, value)

In this call, the parser allows var to be quoted, as documented, so this syntax is valid:

"foo" <- 3 * 3
foo
#[1] 9

But as before, foo and "foo" remain different objects.

A lookup similar to match.fun also happens implicitly when a function is invoked. When we treat a symbol like a function, the evaluation of the expression looks for a function with that name and not for a generic object. For instance:

a <- 4
a(2)
#Error in a(2) : could not find function "a"

The error message is clear: the problem is not that a is not a function object, but that no object of mode function named a exists. For instance, we can create objects named like base functions, and R will still call the function when an invocation is made:

log <- 7
log(2) 
#[1] 0.6931472
log
#[1] 7

When the parser sees a string in call position, it treats that string as the name of the function to call, and the function is then looked up as above. This works:

"*"(3, 4)
#[1] 12

This does not, of course:

FUN <- "*"
FUN(3, 4)
#Error in FUN(3, 4) : could not find function "FUN"
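
One way to see why the literal string works in call position while a variable holding a string does not is to inspect the parsed call. This is only a sketch; the variable names call_from_string and FUN are chosen for the illustration:

```r
# A string literal in call position is stored as a symbol in the parsed call
call_from_string <- quote("*"(3, 4))
is.symbol(call_from_string[[1]])                # TRUE
identical(call_from_string, quote(`*`(3, 4)))   # TRUE

# A variable that holds a string has to be resolved explicitly
FUN <- "*"
match.fun(FUN)(3, 4)                            # 12
```

So "*"(3, 4) and `*`(3, 4) end up as the same call, whereas FUN(3, 4) searches for a function named FUN and never looks at the string stored in the variable.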

"an implicit call to as.symbol is performed on var and so this syntax is valid". I believe this is incorrect. These are just two syntaxes accepted by the parser. no symbol object is created in either scenario — moodymudskipper, Jun 02 '23 at 22:31
@moodymudskipper You are likely correct. I made an edit to delete reference to an as.symbol call. Thank you. — nicola, Jun 04 '23 at 05:46
"the former is a function" No, it is not. It is simply a name (a symbol). A function happens to be bound to that name. — Roland, Jun 07 '23 at 09:27
@Roland You are technically correct, but you should say the same to any object. For instance, if you have `x<-1:10`, by the same logic, you shouldn't say that `x` is an integer vector, but just a name which happens to have an integer vector bound to it. — nicola, Jun 07 '23 at 09:56
@nicola Yes, of course. But I feel the distinction is important in the context being discussed. — Roland, Jun 07 '23 at 10:12

benson23 · Answer 2 · 2023-05-26T16:23:52.537

From section 2.2.1 of Advanced R by Hadley Wickham:

You can also create non-syntactic bindings using single or double quotes (e.g. "_abc" <- 1) instead of backticks, but you shouldn’t, because you’ll have to use a different syntax to retrieve the values. The ability to use strings on the left hand side of the assignment arrow is an historical artefact, used before R supported backticks.
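
A short illustration of the "different syntax to retrieve the values" point, reusing the name _abc from the quote (the variable string_only is made up for this sketch):

```r
"_abc" <- 1            # assignment with a string on the left-hand side works

`_abc`                 # retrieval needs backticks ...
get("_abc")            # ... or an explicit lookup by name

string_only <- "_abc"  # the same quotes on the right-hand side are just a string
is.character(string_only)
```

With backticks on both sides, the assignment and the retrieval look the same, which is the consistency argument made in the quote.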

And from ?Quotes (my bold):

Identifiers consist of a sequence of letters, digits, the period (.) and the underscore. They must not start with a digit nor underscore, nor with a period followed by a digit. Reserved words are not valid identifiers.
...
Such identifiers are also known as syntactic names and may be used directly in R code. Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (‘`’), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.
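
To illustrate the last sentence about formulae, here is a small sketch using all.vars, which lists the variable names found in an expression. The column name "my var" and the names f_backtick and f_string are invented for the example:

```r
f_backtick <- y ~ `my var`  # backticks: `my var` is a variable in the formula
f_string   <- y ~ "my var"  # quotes: "my var" is a character constant

all.vars(f_backtick)  # "y" "my var"
all.vars(f_string)    # "y"
```

With quotes, the formula no longer refers to a variable called my var at all, so modelling functions would not look up that column.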

score 2 · Answer 3 · answered Jun 02 '23 at 22:51

Using double quotes creates a challenge for static analysis.

getParseData(parse(text = 'x <- 1; x(1); c(x = 1)'))
#>    line1 col1 line2 col2 id parent                token terminal text
#> 7      1    1     1    6  7      0                 expr    FALSE     
#> 1      1    1     1    1  1      3               SYMBOL     TRUE    x
#> 3      1    1     1    1  3      7                 expr    FALSE     
#> 2      1    3     1    4  2      7          LEFT_ASSIGN     TRUE   <-
#> 4      1    6     1    6  4      5            NUM_CONST     TRUE    1
#> 5      1    6     1    6  5      7                 expr    FALSE     
#> 6      1    7     1    7  6      0                  ';'     TRUE    ;
#> 19     1    9     1   12 19      0                 expr    FALSE     
#> 10     1    9     1    9 10     12 SYMBOL_FUNCTION_CALL     TRUE    x
#> 12     1    9     1    9 12     19                 expr    FALSE     
#> 11     1   10     1   10 11     19                  '('     TRUE    (
#> 13     1   11     1   11 13     14            NUM_CONST     TRUE    1
#> 14     1   11     1   11 14     19                 expr    FALSE     
#> 15     1   12     1   12 15     19                  ')'     TRUE    )
#> 20     1   13     1   13 20      0                  ';'     TRUE    ;
#> 34     1   15     1   22 34      0                 expr    FALSE     
#> 23     1   15     1   15 23     25 SYMBOL_FUNCTION_CALL     TRUE    c
#> 25     1   15     1   15 25     34                 expr    FALSE     
#> 24     1   16     1   16 24     34                  '('     TRUE    (
#> 26     1   17     1   17 26     34           SYMBOL_SUB     TRUE    x
#> 27     1   19     1   19 27     34               EQ_SUB     TRUE    =
#> 28     1   21     1   21 28     29            NUM_CONST     TRUE    1
#> 29     1   21     1   21 29     34                 expr    FALSE     
#> 30     1   22     1   22 30     34                  ')'     TRUE    )

# backticks don't change the way it's parsed
getParseData(parse(text = '`x` <- 1; `x`(1); c(`x` = 1)')) 
#>    line1 col1 line2 col2 id parent                token terminal text
#> 7      1    1     1    8  7      0                 expr    FALSE     
#> 1      1    1     1    3  1      3               SYMBOL     TRUE  `x`
#> 3      1    1     1    3  3      7                 expr    FALSE     
#> 2      1    5     1    6  2      7          LEFT_ASSIGN     TRUE   <-
#> 4      1    8     1    8  4      5            NUM_CONST     TRUE    1
#> 5      1    8     1    8  5      7                 expr    FALSE     
#> 6      1    9     1    9  6      0                  ';'     TRUE    ;
#> 19     1   11     1   16 19      0                 expr    FALSE     
#> 10     1   11     1   13 10     12 SYMBOL_FUNCTION_CALL     TRUE  `x`
#> 12     1   11     1   13 12     19                 expr    FALSE     
#> 11     1   14     1   14 11     19                  '('     TRUE    (
#> 13     1   15     1   15 13     14            NUM_CONST     TRUE    1
#> 14     1   15     1   15 14     19                 expr    FALSE     
#> 15     1   16     1   16 15     19                  ')'     TRUE    )
#> 20     1   17     1   17 20      0                  ';'     TRUE    ;
#> 34     1   19     1   28 34      0                 expr    FALSE     
#> 23     1   19     1   19 23     25 SYMBOL_FUNCTION_CALL     TRUE    c
#> 25     1   19     1   19 25     34                 expr    FALSE     
#> 24     1   20     1   20 24     34                  '('     TRUE    (
#> 26     1   21     1   23 26     34           SYMBOL_SUB     TRUE  `x`
#> 27     1   25     1   25 27     34               EQ_SUB     TRUE    =
#> 28     1   27     1   27 28     29            NUM_CONST     TRUE    1
#> 29     1   27     1   27 29     34                 expr    FALSE     
#> 30     1   28     1   28 30     34                  ')'     TRUE    )

# quotes do, and it can mess with static analysis
getParseData(parse(text = '"x" <- 1; "x"(1); c("x" = 1)'))
#>    line1 col1 line2 col2 id parent                token terminal text
#> 7      1    1     1    8  7      0                 expr    FALSE     
#> 1      1    1     1    3  1      3            STR_CONST     TRUE  "x"
#> 3      1    1     1    3  3      7                 expr    FALSE     
#> 2      1    5     1    6  2      7          LEFT_ASSIGN     TRUE   <-
#> 4      1    8     1    8  4      5            NUM_CONST     TRUE    1
#> 5      1    8     1    8  5      7                 expr    FALSE     
#> 6      1    9     1    9  6      0                  ';'     TRUE    ;
#> 19     1   11     1   16 19      0                 expr    FALSE     
#> 10     1   11     1   13 10     12            STR_CONST     TRUE  "x"
#> 12     1   11     1   13 12     19                 expr    FALSE     
#> 11     1   14     1   14 11     19                  '('     TRUE    (
#> 13     1   15     1   15 13     14            NUM_CONST     TRUE    1
#> 14     1   15     1   15 14     19                 expr    FALSE     
#> 15     1   16     1   16 15     19                  ')'     TRUE    )
#> 20     1   17     1   17 20      0                  ';'     TRUE    ;
#> 34     1   19     1   28 34      0                 expr    FALSE     
#> 23     1   19     1   19 23     25 SYMBOL_FUNCTION_CALL     TRUE    c
#> 25     1   19     1   19 25     34                 expr    FALSE     
#> 24     1   20     1   20 24     34                  '('     TRUE    (
#> 26     1   21     1   23 26     34            STR_CONST     TRUE  "x"
#> 27     1   25     1   25 27     34               EQ_SUB     TRUE    =
#> 28     1   27     1   27 28     29            NUM_CONST     TRUE    1
#> 29     1   27     1   27 29     34                 expr    FALSE     
#> 30     1   28     1   28 30     34                  ')'     TRUE    )
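
As a rough sketch of how this can trip up a tool, consider a naive analysis that only looks at symbols. The helper terminal_tokens is made up for this illustration, and keep.source = TRUE is passed explicitly so the parse data is also available in non-interactive sessions:

```r
# Hypothetical helper: the terminal tokens of a piece of code
terminal_tokens <- function(code) {
  pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
  pd$token[pd$terminal]
}

"SYMBOL" %in% terminal_tokens("x <- 1")     # TRUE
"SYMBOL" %in% terminal_tokens("`x` <- 1")   # TRUE
"SYMBOL" %in% terminal_tokens('"x" <- 1')   # FALSE: the target is a STR_CONST

# The same difference shows up with all.vars()
all.vars(quote(`x` <- 1))   # "x"
all.vars(quote("x" <- 1))   # character(0)
```

A tool that collects assigned variables from SYMBOL tokens (or from all.vars) would miss the quoted assignment unless it handles string constants on the left-hand side as a special case.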

On the other hand, double/single quotes give better traces when the function is passed by name to do.call, and in some older versions of R the pretty version below was also faster.

f <- function (...) {
  stop (...)
}
ugly <- function(x, ...) {
  do.call(f, list("!",  "!" ))
}
ugly()
traceback()
# 4: stop(...) at #2
# 3: (function (...) 
# {
#     stop(...)
# })("!", "!")
# 2: do.call(f, list("!", "!")) at #2
# 1: ugly()

pretty <- function(x, ...) {
  do.call("f", list("!",  "!" ))
}
pretty()
traceback()
# 4: stop(...) at #2
# 3: f("!", "!")
# 2: do.call("f", list("!", "!")) at #2
# 1: pretty()

Created on 2023-06-03 with reprex v2.0.2

Using only backquotes, you'll be more consistent.
But to my taste, "this that" <- 1 actually looks nicer, because you see that something funky is happening.
dplyr documents c("x_a" = "y_a", "x_b" = "y_b") for joins; it is more symmetrical.
It's really mostly a matter of taste.

We talked about it here: https://twitter.com/antoine_fabri/status/1579863982294601728
