I read the Google style guide for R. For "Assignment", they say:

Use <-, not =, for assignment.

GOOD:
x <- 5

BAD:
x = 5

Can you tell me what the difference is between these two methods of assignment, and why one is to be preferred over the other?

Cody Gray - on strike
Dr. Manuel Kuehner

2 Answers

I believe there are two reasons. One is that <- and = have slightly different meanings depending on context. For example, compare the behavior of the statements:

printx <- function(x) print(x)
printx(x="hello")
printx(x<-"hello")

In the second case, printx(x<-"hello") also assigns x in the calling scope (once the argument is evaluated), whereas printx(x="hello") only sets the parameter.
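To make the side effect visible, here is a minimal sketch that checks whether a variable named x exists in the calling environment after each call. It reuses printx from above; the helper variables before and after are just illustrative names.

```r
printx <- function(x) print(x)

# Start from a clean slate so the checks below are meaningful.
if (exists("x", inherits = FALSE)) rm(x)

printx(x = "hello")                      # binds the parameter only
before <- exists("x", inherits = FALSE)  # no variable x was created here

printx(x <- "hello")                     # assigns x here, then passes its value
after <- exists("x", inherits = FALSE)   # x now exists in this environment

before
after
```

Because R evaluates arguments lazily, the assignment inside the call only happens when the function actually uses that argument, which is one more reason to avoid the pattern.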

The other reason is historical. R, S, and the "APL" languages they were based on originally allowed only the arrow for assignment (which historically was a single character). Ref: http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html

thc

Both are used, just in different contexts. If we don't use them in the right contexts, we'll see errors. See here:

Using <- is for defining a local variable.

#Example: creating a vector
x <- c(1,2,3)

#Here we could use = and that would happen to work in this case.

Using <<- , as Joshua Ulrich says, searches the parent environments "for an existing definition of the variable being assigned." It assigns to the global environment if no parent environments contain the variable.

#Example: saving information calculated in a function
x <- list()
this.function <- function(data){
  # ...misc calculations...
  result <- data  # placeholder standing in for the computed result
  x[[1]] <<- result
}

#Here we can't use <- or =; they would only change a local copy of x inside the function.
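As a rough illustration of the lookup described above, the sketch below contrasts <<- with <- inside a closure. The names make_counter, make_broken_counter, and count are hypothetical and chosen only for this example.

```r
make_counter <- function() {
  count <- 0              # lives in the environment created by make_counter()
  function() {
    count <<- count + 1   # updates the enclosing 'count', not a global
    count
  }
}

make_broken_counter <- function() {
  count <- 0
  function() {
    count <- count + 1    # creates a fresh local 'count' on every call
    count
  }
}

counter <- make_counter()
first <- counter()
second <- counter()

broken <- make_broken_counter()
broken_first <- broken()
broken_second <- broken()

# The enclosing 'count' was found first, so nothing was written to the global environment.
leaked <- exists("count", envir = globalenv(), inherits = FALSE)
```

Here <<- finds count in the enclosing function's environment and stops there. It would fall back to the global environment only if no enclosing environment defined count.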

Using = inside a function call matches a value to a named argument; it does not create a variable in the calling scope.

#Example: plotting an existing vector (defining it first)
first_col <- c(1,2,3)
second_col <- c(1,2,3)
plot(x=first_col, y=second_col)

#Example: assigning x and y in the calling scope while passing them to 'plot' by position
plot(x<-c(1,2,3), y<-c(1,2,3))

#The first case is preferable and can lead to fewer errors.
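One way to see why the named form is safer: with <-, the arguments are matched by position, so the names on the left of the arrow have no effect on which parameter receives which value. A small sketch, using a hypothetical function power:

```r
power <- function(base, exponent) base^exponent

# Matched by name: order does not matter, so this is 2^3.
by_name <- power(exponent = 3, base = 2)

# Matched by position: 3 goes to 'base' and 2 goes to 'exponent', so this is 3^2.
# As a side effect, variables named 'exponent' and 'base' are created here.
by_position <- power(exponent <- 3, base <- 2)

by_name
by_position
```

The two calls look almost identical but return different results, and the second one also leaves exponent and base behind in the calling environment.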

Then we use == if we're asking if one thing is equal to another, like this:

#Example: check if contents of x match those in y:
x <- c(1,2,3)
y <- c(1,2,3)
x==y
[1] TRUE TRUE TRUE

#Here we can't use <- or =; they would assign to x instead of comparing.
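Since == compares element by element, it returns one logical value per position. When a single TRUE/FALSE is wanted, all() or identical() are common choices. A short sketch with illustrative variable names:

```r
x <- c(1, 2, 3)
y <- c(1, 2, 4)

elementwise <- x == y                    # one TRUE/FALSE per position
all_equal <- all(x == y)                 # a single value: are all positions equal?
same_object <- identical(x, c(1, 2, 3))  # a single value: exactly the same object?

elementwise
all_equal
same_object
```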
www
    `<<-` is *not* for assigning to the global environment. It searches the parent environments "for an existing definition of the variable being assigned." The only time it assigns to the global environment is if no parent environments contain the variable. – Joshua Ulrich Sep 03 '17 at 19:41