data.table has introduced the := operator. Why not overload <-?
-
Let me guess: In homage to Pascal!!! – Iterator Aug 11 '11 at 22:50
-
I guess! We couldn't choose any operator, it was just that (fortunately) R allows := to be defined. Otherwise we could have fun and define +=, -=, ~= etc :) – Matt Dowle Aug 11 '11 at 23:29
-
Could someone please explain what "overloading <-" means? – Michael Oct 10 '12 at 22:45
-
@Michael "Overloading" roughly speaking meant replacing `<-` with another version of `<-` that works differently. Making `<-` work differently, somehow. Instead, we used a new operator, `:=`, for clarity amongst other reasons. Almost everything in R is a function, even `<-` and `[` etc. – Matt Dowle Oct 10 '12 at 22:57
-
@Michael But probably 'overloading' was technically incorrect, as Owen pointed out in comments. I meant it in a loose sense. – Matt Dowle Oct 10 '12 at 23:05
2 Answers
There are two places where `<-` could be 'overloaded':
x[i, j] <- value # 1
x[i, {colname <- value}] # 2
The first one copies the whole of `x` to `*tmp*`, changes that working copy, and assigns it back to `x`. That's an R thing (src/main/eval.c and subassign.c) discussed recently on r-devel here. It sounded like it might be possible to change R to allow packages, or R itself, to avoid that copy to `*tmp*`, but that isn't currently possible, IIUC.
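As a rough base-R illustration of that first case (not data.table code): the R Language Definition describes a subassignment such as `x[2] <- 10` as being evaluated roughly like the three lines in the second half of this sketch. The vectors `x` and `y` and their values are made up for the example, and whether a physical copy really happens depends on the R version.

```r
# Made-up example data; y starts as an identical vector to x.
x <- c(a = 1, b = 2, c = 3)
y <- x

# Ordinary subassignment: `<-` hands the work over to `[<-`.
x[2] <- 10

# Roughly what R does for the line above: work on a temporary
# called *tmp*, call the replacement function, assign the result back.
`*tmp*` <- y
y <- `[<-`(`*tmp*`, 2, value = 10)
rm(`*tmp*`)

# Both routes end with the same vector.
identical(x, y)
```

The point is only the shape of the evaluation: the result of `[<-` is assigned back to the whole of `x`, which is the step that `:=` is meant to avoid.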
The second one is what Owen's answer refers to, I think. If you accept that it's ok to do assignment by reference within `j` like that, then which operator? As per the comment to Owen's answer, `<-` and `<<-` are already used by users in `j`, so we hit upon `:=`.
Even if `[<-` didn't copy the whole of `x`, we would still like `:=` in `j` so we can do things like this:
DT[,{newcol1:=sum(a)
newcol2:=a/newcol1}, by=group]
where the new columns are added by reference to the table, and the RHS of each `:=` is evaluated within each group. (When `:=` within group is implemented.)
Update Oct 2012
As of 1.8.2 (on CRAN in Jul 2012), `:=` by group was implemented for adding or updating single columns; i.e., a single LHS of `:=`. And now in v1.8.3 (on R-Forge at the time of writing), multiple columns can be added by group; e.g.,
DT[, c("newcol1","newcol2") := .(sum(a),sum(b)), by=group]
or, perhaps more elegantly:
DT[,`:=`(newcol1=sum(a),
newcol2=sum(b)), by=group]
But the iterative multiple RHS, envisaged for a while, where the 2nd expression could use the result from the first, isn't implemented yet (FR#1492). So the following still gives the error "newcol1 not found" and needs to be done in two steps:
DT[,`:=`(newcol1=sum(a),
newcol2=a/newcol1), by=group]
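For completeness, a minimal sketch of what the two steps might look like. It assumes the data.table package is installed; the table `DT` and its values are made up for illustration, and only the column names follow the example above.

```r
library(data.table)  # assumption: data.table is installed

# Made-up data: two groups, one numeric column.
DT <- data.table(group = c("x", "x", "y"), a = c(1, 3, 4))

# Step 1: add the per-group sum by reference.
DT[, newcol1 := sum(a), by = group]

# Step 2: newcol1 now exists as a column, so it can be used on the RHS.
DT[, newcol2 := a / newcol1, by = group]

print(DT)
```

Each call updates `DT` in place, so nothing needs to be assigned back with `<-`.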

-
Just a minor thing, `x[i, j] <- value` isn't actually overloading `<-`, rather `<-` does what it always does by delegating to `[<-` (based on the expression, not the value type). – Owen Aug 12 '11 at 00:45
-
@Owen Ah yes, good point. Have edited and added quotes around 'overloaded'. – Matt Dowle Aug 12 '11 at 07:09
I don't think there is any technical reason this should be necessary, for the following reason: `:=` is only used inside `[...]`, so it is always quoted. `[...]` goes through the expression tree to see if `:=` is in it.
That means it's not really acting as an operator and it's not really overloaded, so they could have picked pretty much any operator they wanted. I guess maybe it looked better? Or less confusing because it's clearly not `<-`?
(Note that if `:=` were used outside of `[...]`, it could not be `<-`, because you can't actually overload `<-`: `<-` doesn't evaluate its left-hand argument, so it doesn't know what the type is.)
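To make the "always quoted" point concrete, here is a small base-R sketch. It is not data.table's actual code: `is_walrus_call` and `inspect_j` are hypothetical names, and the check only looks at the top-level call rather than walking the whole tree.

```r
# Hypothetical helper: TRUE if an unevaluated expression is a call to `:=`.
is_walrus_call <- function(expr) {
  is.call(expr) && identical(expr[[1L]], as.name(":="))
}

# Hypothetical stand-in for `[.data.table`: j is captured with substitute()
# and never evaluated, so `:=` does not need a working definition.
inspect_j <- function(j) {
  jsub <- substitute(j)
  if (is_walrus_call(jsub)) {
    "assignment by reference requested"
  } else {
    "ordinary j expression"
  }
}

inspect_j(newcol := sum(a))  # `:=` is only parsed, not called
inspect_j(sum(a))            # no `:=` at the top of the expression
```

Because the argument is only inspected as a parse tree, neither `newcol` nor `a` has to exist when `inspect_j` is called.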

-
Yes, that's pretty much it. We tried <- first actually but that didn't fly because user code already used <- in j e.g. incrementing a group counter. Then we tried <<- but people already use that in j too, to assign to .GlobalEnv. So then we hit upon :=. – Matt Dowle Aug 11 '11 at 22:31