935

What are the differences between the assignment operators = and <- in R?

I know that the operators are slightly different, as this example shows:

x <- y <- 5
x = y = 5
x = y <- 5
x <- y = 5
# Error in (x <- y) = 5 : could not find function "<-<-"

But is this the only difference?

user438383
csgillespie
  • 62
    As noted [here](http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html) the origins of the `<-` symbol come from old APL keyboards that actually had a single `<-` key on them. – joran Dec 12 '14 at 17:35

9 Answers

786

The difference in assignment operators is clearer when you use them to set an argument value in a function call. For example:

median(x = 1:10)
x   
## Error: object 'x' not found

In this case, x is declared within the scope of the function, so it does not exist in the user workspace.

median(x <- 1:10)
x    
## [1]  1  2  3  4  5  6  7  8  9 10

In this case, x is declared in the user workspace, so you can use it after the function call has been completed.


There is a general preference among the R community for using <- for assignment (other than in function signatures) for compatibility with (very) old versions of S-Plus. Note that the spaces help to clarify situations like

x<-3
# Does this mean assignment?
x <- 3
# Or less than?
x < -3

Most R IDEs have keyboard shortcuts to make <- easier to type. Ctrl + = in Architect, Alt + - in RStudio (Option + - under macOS), Shift + - (underscore) in emacs+ESS.


If you prefer writing = to <- but want to use the more common assignment symbol for publicly released code (on CRAN, for example), then you can use one of the tidy_* functions in the formatR package to automatically replace = with <-.

library(formatR)
tidy_source(text = "x=1:5", arrow = TRUE)
## x <- 1:5

The answer to the question "Why does x <- y = 5 throw an error but not x <- y <- 5?" is "It's down to the magic contained in the parser". R's syntax contains many ambiguous cases that have to be resolved one way or another. The parser chooses to resolve the bits of the expression in different orders depending on whether = or <- was used.

To understand what is happening, you need to know that assignment silently returns the value that was assigned. You can see that more clearly by explicitly printing, for example print(x <- 2 + 3).
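As a small sketch of that point, base R's `withVisible()` reports both the value of an expression and whether that value would be auto-printed. The variable name `x` is only for illustration.

```r
# withVisible() returns the value of an expression plus a visibility flag.
res <- withVisible(x <- 2 + 3)
res$value    # 5: the assignment returns the assigned value
res$visible  # FALSE: the value is returned invisibly

# Wrapping the assignment in parentheses makes the value visible again.
res_paren <- withVisible((x <- 2 + 3))
res_paren$visible  # TRUE
```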

Secondly, it's clearer if we use prefix notation for assignment. So

x <- 5
`<-`(x, 5)  #same thing

y = 5
`=`(y, 5)   #also the same thing

The parser interprets x <- y <- 5 as

`<-`(x, `<-`(y, 5))

We might expect that x <- y = 5 would then be

`<-`(x, `=`(y, 5))

but actually it gets interpreted as

`=`(`<-`(x, y), 5)

This is because = is lower precedence than <-, as shown on the ?Syntax help page.
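One way to check this grouping without evaluating anything is to parse the code and inspect the resulting call objects. This is only a sketch using base R's `parse()`; the names `chained` and `mixed` are made up for the example.

```r
# parse() returns an expression vector; [[1]] extracts the single call.
chained <- parse(text = "x <- y <- 5")[[1]]
mixed   <- parse(text = "x <- y = 5")[[1]]

# Outermost function of each call.
as.character(chained[[1]])  # "<-"
as.character(mixed[[1]])    # "="

# In the mixed call, the left-hand side is itself the call `x <- y`.
is.call(mixed[[2]])            # TRUE
as.character(mixed[[2]][[1]])  # "<-"

# In the chained call, the nested call sits on the right-hand side instead.
is.call(chained[[3]])          # TRUE
```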

Richie Cotton
  • 12
    This is also mentioned in chapter 8.2.26 of [The R Inferno](http://www.burns-stat.com/pages/Tutor/R_inferno.pdf) by Patrick Burns (Not me but a recommendation anyway) – Uwe Jun 14 '16 at 09:17
  • 5
    I just realised that your explanation of how `x <- x = 5` gets interpreted is slightly wrong: In reality, R interprets it as ``​`<-<-`(x, y = 5, value = 5)`` (which itself is more or less equivalent to ``tmp <- x; x <- `<-<-`(tmp, y = 5, value = 5)``). Yikes! – Konrad Rudolph Jul 27 '18 at 11:25
  • 12
    … And I just realised that the very first part of this answer is incorrect and, unfortunately, quite misleading because it perpetuates a common misconception: The way you use `=` in a function call **does not perform assignment**, and isn’t an assignment operator. It’s an entirely distinct parsed R expression, which just happens to use the same character. Further, the code you show does not “declare” `x` in the scope of the function. The *function declaration* performs said declaration. The function call doesn’t (it gets a bit more complicated with named `...` arguments). – Konrad Rudolph Apr 12 '19 at 10:33
  • It seems to me that the "=" operator in the arguments of function call has pretty-much the same semantics as it does outside the arglist. In both contexts, it is binding a name to an object. In both contexts, there's a copy-on-write semantics, whereby any change to the object under its "new" name will cause the object to be copied (except in the case of a data.table). In both cases, there's plenty of room for semantic confusion about whether the copy will be "deep" or "shallow". But... semantics in R is like grammar in English, in that it's a convenience rather than a formality. – Clark Thomborson Nov 24 '22 at 02:30
  • What output would you expect from this? `f1 <- function(b = ifelse( ("b" %in% ls()), b, b<-0), x = x <- b + 100) {` `b <<- b + x + 1000` `return(b)` `}` `f1()` Hint: All assignment operators (AFAIK!) will bind the name on the LHS to a promise on the RHS, but ... depending on the particular operator and its context... the promise might be evaluated immediately or at some later date; the binding might be completed at parse-time (e.g. in a function call), or at runtime; and the name-lookup may occur in different environments. Yeah as if *that* answer will help anyone ;-) – Clark Thomborson Nov 26 '22 at 09:15
  • 1
    @ClarkThomborson The semantics *are* fundamentally different because in R assignment is a regular operation which is performed via a *function call* to an assignment function. However, this is *not* the case for `=` in an argument list. In an argument list, `=` is an arbitrary separator token which is no longer present after parsing. After parsing `f(x = 1)`, R sees (essentially) `call("f", 1)`. Whereas for `x = 1` R sees `call("=", "x", 1)`. It's true that in both cases name binding *also* happens but, for the assignment operator, it happens after calling the assignment operator function. – Konrad Rudolph Jan 12 '23 at 15:35
  • @Konrad Rudolph I think we're differing on our definition of the boundary between "syntax" and "semantics". It's an unclear boundary in the case of any interpreted language, such as R, which allows a string of characters to be passed to the parser, at runtime, and then executed. R goes one step further than most interpreted languages, by allowing a running code to modify its own parse tree! Much of the wizardry in 'data.table' involves such modification. If R had a formal syntax, I think the argument list of a method-call would be a string whose final semantics is defined at runtime. – Clark Thomborson Jan 13 '23 at 22:39
  • @ClarkThomborson I don’t follow. R *has* a formal syntax. And while I agree that R’s ability to transform unevaluated expressions at runtime is pretty cool, it doesn’t come into play here. I merely borrowed the `call` syntax to format the R parse tree in a readable way for the comment, in order to illustrate that the `=` token in argument assignment does not lead to a function call action at runtime in the way that assignment does (see our previous discussion in chat). – Konrad Rudolph Jan 14 '23 at 18:56
  • FYI the Jupyter notebook uses the same shortcut for `<-` as RStudio. (In Jupyter lab you need an extension or [tweak the settings yourself](https://gist.github.com/janxkoci/da084bafb09fef8c5dae36b1313c35c4).) – jena Mar 14 '23 at 11:07
252

What are the differences between the assignment operators = and <- in R?

As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:

…
‘-> ->>’           rightwards assignment
‘<- <<-’           assignment (right to left)
‘=’                assignment (right to left)
…

But is this the only difference?

Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:

The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

Let’s not put too fine a point on it: the R documentation is wrong. This is easy to show: we just need to find a counter-example of the = operator that isn’t (a) at the top level, nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). — Without further ado:

x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1

Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?

It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated (even by experts, including in the documentation cited above):

  1. The first meaning is as an assignment operator. This is all we’ve talked about so far.
  2. The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator it performs no action at runtime, it merely changes the way an expression is parsed.
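The two meanings can be told apart by looking at what the parser produces. The following is a minimal sketch using base R's `parse()`; `f` is a hypothetical function name that is never called, and `assignment` and `named_arg` are names chosen for the example.

```r
assignment <- parse(text = "x = 1")[[1]]
named_arg  <- parse(text = "f(x = 1)")[[1]]

# As an operator, `=` is the function being called, with two arguments.
as.character(assignment[[1]])  # "="
length(assignment)             # 3
names(assignment)              # NULL

# As a named-argument token, `=` leaves no call behind:
# the only function in the call is f, and x survives as an argument name.
as.character(named_arg[[1]])   # "f"
length(named_arg)              # 2
names(named_arg)               # "" "x"
```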

So how does R decide whether a given usage of = refers to the operator or to named argument passing? Let’s see.

In any piece of code of the general form …

‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)

… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:

if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …

Any of these will raise an error “unexpected '=' in ‹bla›”.

In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:

median((x = 1 : 10))

But also:

if (! (nf = length(from))) return()

Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =) — it’s a pervasive pattern in much of the core R codebase.
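The contrast between the forbidden form and the parenthesised form can be sketched as follows. The `from` vector and the result strings are hypothetical stand-ins, not the actual `file.copy` code.

```r
# A bare `=` in an `if` condition is rejected by the parser.
bare <- tryCatch(
  parse(text = "if (n = 0) 'zero' else 'nonzero'"),
  error = function(e) conditionMessage(e)
)
is.character(bare)  # TRUE: parsing failed, so we got the error message back

# With an extra pair of parentheses, the same pattern parses and assigns.
from <- character(0)  # hypothetical empty input
result <- if (!(nf = length(from))) "nothing to copy" else "copying"
nf      # 0
result  # "nothing to copy"
```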

The original explanation by John Chambers, which the R documentation is probably based on, actually explains this correctly:

[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.


In sum, by default the operators <- and = do the same thing. But either of them can be overridden separately to change its behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical but it can be used for some fun shenanigans.
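A rough illustration of that last point, under the assumption that we shadow `<-` only inside a scratch environment: `counting_env`, `hits`, and the counting override are all invented for this sketch, and the override deliberately does not assign anything.

```r
counting_env <- new.env()
counting_env$hits <- 0L

# Hypothetical override: counts calls instead of assigning anything.
# It uses `=` internally, which is still the built-in operator.
counting_env$`<-` <- function(lhs, rhs) {
  counting_env$hits = counting_env$hits + 1L
  invisible(NULL)
}

evalq({
  a <- 1   # calls the override
  2 -> b   # parsed as `<-`(b, 2), so it calls the same override
  z = 3    # still the built-in `=`, so this really assigns
}, counting_env)

counting_env$hits                                    # 2
exists("a", envir = counting_env, inherits = FALSE)  # FALSE
exists("z", envir = counting_env, inherits = FALSE)  # TRUE
```

Keeping the override inside its own environment means the rest of the session keeps the normal behaviour of `<-`.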

Konrad Rudolph
  • 4
    About the precedence, and errors in R's doc, the precedence of `?` is actually right in between `=` and `<-`, which has important consequences when overriding `? `, and virtually none otherwise. – moodymudskipper Jan 10 '20 at 00:12
  • 2
    @Moody_Mudskipper that’s bizarre! You seem to be right, but according to the *source code* ([`main/gram.y`](https://github.com/wch/r-source/blob/386c3a93cbcaf95017fa6ae52453530fb95149f4/src/main/gram.y#L384-L390)), the precedence of `?` is correctly documented, and is lower than both `=` and `<-`. – Konrad Rudolph Jan 10 '20 at 10:13
  • 1
    I like your explanation of R semantics... which I'd rephrase as follows. The "=" operator is overloaded. Its base semantics is to bind a formal name to an actual parameter in the arglist of a function call. In most (but not all!) contexts outside a function call, it has the same semantics as "<-": it binds a name to an existing object (or to a constant value), with copy-on-write semantics, with the side-effect of defining this name if it is currently undefined. In a few contexts, it is bound to stop() to warn naive or careless users who confuse it with the "==" operator. – Clark Thomborson Nov 24 '22 at 02:50
  • 1
    @ClarkThomborson I don't agree with calling one of the meanings the "base" semantics, because this implies a hierarchy that doesn't exist. And I think it's confusing to call `=` an overloaded operator, as well: the term "operator" in R has (until R 4.0, at least!) a specific meaning referring to a function call with special syntactic rules. This isn't what `=` is doing when used to bind a name to a parameter name inside a function call argument list. There's no call happening, so `=` in this context is just a syntactic token (like `;`), not an operator. – Konrad Rudolph Nov 24 '22 at 10:17
  • I'm a newbie to R... had used S for a project about 30 years ago... and I'm still trying to get my head around its semantics! I tell myself stories about them, then find errors in the stories. My current understanding is that R, like most scripting languages, and like all natural languages, has a very complex semantics which varies with the context and has developed over time. We can still interpret historic documents and dusty-deck R scripts pretty accurately *if* we interpret them in an appropriate (historic) context. A formal semantics is more likely to confuse than to explain.... – Clark Thomborson Nov 24 '22 at 15:12
  • I don't find it helpful to think of `=` as sometimes being a token, and sometimes being a full-fledged assignment operator. To my way of thinking, `<-` is also a token... and it's the tokenization phase of parsing that makes "x<-3" problematic. The operation of binding a formal name to a value in a function call occurs at the time the function call is being interpreted. The name-value binding (and the possible name-definition) of `x <- 3` occurs at the time this expression is being interpreted. There is a temporal distinction between `=` and `<-` but it has to do with operator precedence. – Clark Thomborson Nov 24 '22 at 15:30
  • 1
    @ClarkThomborson Regardless of whether you find it helpful, that's precisely what it *is:* the syntactic `=` token for named parameters completely vanishes after the parsing phase, it isn't represented in the parse tree at all, nor is it evaluated. Name binding in assignment and in function calling happens fundamentally differently in R (and, to varying extents, in other languages), it isn't merely a "temporal distinction", nor is it due to operator precedence. – Konrad Rudolph Nov 24 '22 at 15:33
  • Oh wow that's **very** helpful. I had guessed -- apparently incorrectly -- that the binding of formal name to actual parameter happened at runtime in R. That'd be a significant gain in efficiency of interpretation, with some loss of semantic range compared to a scripting language which binds formal names to values at runtime. So... my new "story" about the semantics of `=` and `<-` is that, in some contexts, `=` denotes a name-value binding that is interpreted immediately after the script is parsed, and in other contexts its name-value binding is delayed. – Clark Thomborson Nov 24 '22 at 15:45
  • @Clark I am not really sure what you mean by that. Of course the binding happens at runtime. It just doesn't happen via an operator (which, in R, is a *function call!)*, it doesn't happen differently whether or not an explicit parameter name is specified (i.e. whether or not `=` is present in the source code), and the `=` token for parameter name binding is not present in the parse tree (compare `as.list(str2lang('x = 1'))` and `as.list(str2lang('f(x = 1)'))`). – Konrad Rudolph Nov 24 '22 at 15:53
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/249864/discussion-between-konrad-rudolph-and-clark-thomborson). – Konrad Rudolph Nov 24 '22 at 15:55
111

Google's R style guide simplifies the issue by prohibiting the "=" for assignment. Not a bad choice.

https://google.github.io/styleguide/Rguide.xml

The R manual goes into nice detail on all 5 assignment operators.

http://stat.ethz.ch/R-manual/R-patched/library/base/html/assignOps.html

xxfelixxx
Nosredna
  • 13
    Note that any non-0 is considered `TRUE` by R. So if you intend to test if `x` is less than `-y`, you might write `if (x<-y)` which will not warn or error, and appear to work fine. It'll only be `FALSE` when `y=0`, though. – Matt Dowle Jun 08 '12 at 15:21
  • 50
    Why hurt your eyes and finger with `<-` if you can use `=`? In 99.99% of times `=` is fine. Sometimes you need `<<-` though, which is a different history. – Fernando Oct 09 '13 at 01:22
48

x = y = 5 is equivalent to x = (y = 5), because the assignment operators "group" right to left, which works. Meaning: assign 5 to y, leaving the number 5; and then assign that 5 to x.

This is not the same as (x = y) = 5, which doesn't work! Meaning: assign the value of y to x, leaving the value of y; and then assign 5 to, umm..., what exactly?

When you mix the different kinds of assignment operators, <- binds tighter than =. So x = y <- 5 is interpreted as x = (y <- 5), which is the case that makes sense.

Unfortunately, x <- y = 5 is interpreted as (x <- y) = 5, which is the case that doesn't work!

See ?Syntax and ?assignOps for the precedence (binding) and grouping rules.
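The two mixed cases can be tried out as below. This is a sketch that assumes `x` and `y` already exist, as they do in the question; the failing line is wrapped in `tryCatch()` so the script keeps running, and `after_mixed` and `msg` are names made up for the example.

```r
x <- 1; y <- 2   # make sure both names exist first

x = y <- 5       # groups as x = (y <- 5)
after_mixed <- c(x, y)
after_mixed      # 5 5

# x <- y = 5 groups as (x <- y) = 5, which fails when evaluated.
msg <- tryCatch(
  eval(parse(text = "x <- y = 5")),
  error = function(e) conditionMessage(e)
)
msg              # the message mentions the missing function "<-<-"
```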

Steve Pitchers
38

According to John Chambers, the operator = is only allowed at "the top level," which means it is not allowed in control structures like if, making the following programming error illegal.

> if(x = 0) 1 else x
Error: syntax error

As he writes, "Disallowing the new assignment form [=] in control expressions avoids programming errors (such as the example above) that are more likely with the equal operator than with other S assignments."

You can manage to do this if it's "isolated from surrounding logical structure, by braces or an extra pair of parentheses," so if ((x = 0)) 1 else x would work.

See http://developer.r-project.org/equalAssign.html

Aaron left Stack Overflow
27

From the official R documentation:

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

user438383
Haim Evgi
  • 12
    I think "top level" means at the statement level, rather than the expression level. So `x <- 42` on its own is a statement; in `if (x <- 42) {}` it would be an expression, and isn't valid. To be clear, this has nothing to do with whether you are in the global environment or not. – Steve Pitchers Sep 16 '14 at 09:58
  • 1
    This: “the operator = is only allowed at the top level” is a widely held misunderstanding and completely wrong. – Konrad Rudolph Mar 02 '17 at 14:20
  • This is not true - for example, this works, even though assignment is not a complete expression: `1 + (x = 2)` – Pavel Minaev Mar 05 '17 at 22:52
  • 1
    To clarify the comments by KonradRudolph and PavelMinaev, I think it's too strong to say that it's completely wrong, but there is an exception, which is when it's "isolated from surrounding logical structure, by braces or an extra pair of parentheses." – Aaron left Stack Overflow Oct 15 '19 at 13:57
  • 1
    Or in `function() x = 1`, `repeat x = 1`, `if (TRUE) x = 1`.... – moodymudskipper Jan 10 '20 at 00:18
8

This may also add to understanding of the difference between those two operators:

df <- data.frame(
      a = rnorm(10),
      b <- rnorm(10)
)

For the first element R has assigned values and proper name, while the name of the second element looks a bit strange.

str(df)
# 'data.frame': 10 obs. of  2 variables:
#  $ a             : num  0.6393 1.125 -1.2514 0.0729 -1.3292 ...
#  $ b....rnorm.10.: num  0.2485 0.0391 -1.6532 -0.3366 1.1951 ...

R version 3.3.2 (2016-10-31); macOS Sierra 10.12.1
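A deterministic variant of the same experiment, with fixed vectors in place of `rnorm(10)` (an assumption made only so the result is reproducible), also shows the side effect of the second form: `b` is created in the calling environment.

```r
# 1:3 and 4:6 are stand-ins for the random columns in the example above.
df <- data.frame(
  a = 1:3,   # named argument: sets the column name
  b <- 4:6   # assignment: creates `b`, then passes its value unnamed
)

names(df)  # first is "a"; the second is built from the text of the expression
b          # 4 5 6: the assignment happened in the calling environment
```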

Scarabee
Denis Rasulev
1

I am not sure whether Patrick Burns's book The R Inferno has been cited here. In section 8.2.26, "= is not a synonym of <-", Patrick states: "You clearly do not want to use '<-' when you want to set an argument of a function." The book is available at https://www.burns-stat.com/documents/books/the-r-inferno/

Diego
  • 1
    Yup, [it has been mentioned](https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-assignment-operators-in-r/69078055#comment63079875_1742550). But the question is about the *assignment operator*, whereas your excerpt concerns the syntax for passing arguments. It should be made clear (because there’s substantial confusion surrounding this point) that this is *not* the assignment operator. – Konrad Rudolph Sep 06 '21 at 17:27
-1

There were some differences between <- and = in past versions of R, and in R's predecessor language, S. Currently, though, it seems that using only =, as in other modern languages (Python, Java), won't cause any problems. You can get some extra functionality by using <- when passing a value to an argument, which also creates a global variable at the same time, but this may lead to weird or unwanted behavior, as in

df <- data.frame(
      a = rnorm(10),
      b <- rnorm(10)
)
str(df)
# 'data.frame': 10 obs. of  2 variables:
#  $ a             : num  0.6393 1.125 -1.2514 0.0729 -1.3292 ...
#  $ b....rnorm.10.: num  0.2485 0.0391 -1.6532 -0.3366 1.1951 ...

Highly recommended: this article is the best attempt I have seen at explaining the difference between the two: https://colinfay.me/r-assignment/

Also, think about <- as a function that invisibly returns a value.

a <- 2
(a <- 2)
#> [1] 2

See: https://adv-r.hadley.nz/functions.html

Chunhui Gu
  • Unfortunately Colin Fay’s article (and now your answer) repeats the common misconception about the alleged difference between `=` and `<-`. The explanation is therefore incorrect. See [my answer](https://stackoverflow.com/a/51564252/1968) for an exhaustive correction of this pernicious falsehood. To make it explicit: you can rewrite your first code to use `=` instead of `<-` without changing its meaning: `df <- data.frame(a = rnorm(10), (b = rnorm(10)))`. And just like `<-`, `=` is a function that invisibly returns a value. – Konrad Rudolph Jan 14 '23 at 19:04
  • @Konrad Rudolph In its language design and code interpretation, R follows some rules and principles, for efficiency and usability, that are not seen in other languages. I believe most people who ask about the difference between `=` and `<-` are curious why `R` has more than one assignment operator, unlike other popular science/math languages such as Python, and whether they can safely use only `=` as in other languages. Besides, `(` is also a function in R, so technically `(b = rnorm(10))` is not the same as `b <- rnorm(10)`, since you can override the meaning of the `(` function in code. – Chunhui Gu Jan 15 '23 at 00:42
  • Yes that’s all true but what does this have to do with my comment? The answer to “why” is: “purely historical”, and the answer to “can I just use `=`” is “yes”. Everything else, in particular your claim that `=` can’t be used in some cases, is *incorrect*. Yes, of course you can override `(`, just like you can override `=` and `<-` so, yes, technically you can redefine them so that they are no longer identical. But surely you agree that this is a pure distraction. – Konrad Rudolph Jan 15 '23 at 11:27