I have come across the popular data.table package and one thing in particular intrigued me. It has an in-place assignment operator

:=

This is not defined in base R. In fact, if you haven't loaded the data.table package, trying to use it (e.g., a := 2) raises an error with the message:

Error: could not find function ":="

Also, why does := work at all? Why does R let you define := as an infix operator, while every other user-defined infix function has to be surrounded by % signs? For example, this works:

`:=` <- function(a, b) {
   paste(a,b)
}

"abc" := "def"

Clearly it's not meant to be an alternative syntax to %function.name% for defining infix functions. Is data.table exploiting some parsing quirks of R? Is it a hack? Will it be "patched" in the future?

moodymudskipper
xiaodai
  • Please take a look at the data.table documentation, starting maybe with the FAQ. – Dirk Eddelbuettel Oct 09 '14 at 02:39
  • @DirkEddelbuettel. I understand how it's used in data.table. But the fact that R allows such an operator to be defined without causing a syntax error is what intrigued me. It's a fundamental question about R and maybe how it parses code. – xiaodai Oct 09 '14 at 02:44
  • AFAIK it is local to data.table and only works with the `[` subsetting. So your question is off-base (not an R quirk) which is why I sent you to the data.table docs *which discuss this*. – Dirk Eddelbuettel Oct 09 '14 at 02:49
  • @DirkEddelbuettel I think you are missing my point. I can define a function using `:=` <- function(a,b) paste(a,b); and I can use it by doing "abc" := "def"! But all other infix functions are of the form %in.fn%. Why? – xiaodai Oct 09 '14 at 02:57
  • [This Q&A](http://stackoverflow.com/questions/7033106/why-has-data-table-defined-rather-than-overloading) from Matt might be very relevant here as well. – Arun Oct 09 '14 at 06:51

2 Answers

It is something that the base R parser recognizes and seems to parse as a left assign (at least in terms of order of operations and such). See the C source code for more details.

as.list(parse(text="a:=3")[[1]])
# [[1]]
# `:=`
# 
# [[2]]
# a
# 
# [[3]]
# [1] 3

As far as I can tell, it's undocumented (as far as base R is concerned). But it is a function/operator whose behavior you can change:

`:=`<-function(a,b) {a+b}
3 := 7
# [1] 10

As you can see there really isn't anything special about the ":" part itself. It just happens to be the start of a compound token.
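The parsing behaviour described above can be checked from base R alone. The following is a minimal sketch; `parses` is a hypothetical helper introduced here for illustration, not part of R or data.table. It assumes, as stated above, that `:=` is tokenised like a left assignment, so it should bind loosely and group from right to left, while an arbitrary pair such as `~=` is not in the grammar at all.

```r
# `:=` is accepted by the parser and becomes the head of the call.
e <- quote(a := b := c + 1)
op <- as.character(e[[1]])   # name of the operator at the top of the tree
rhs <- e[[3]]                # right-hand side: the nested call b := c + 1

# Hypothetical helper: TRUE if `txt` parses, FALSE if the parser rejects it.
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

ok_colon_eq <- parses("a := 3")   # the grammar knows this token
ok_tilde_eq <- parses("a ~= 3")   # no such token, so this is a syntax error
```

Nothing here depends on data.table being installed; the package only supplies a meaning for a token the parser already accepts.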

MrFlick
  • Eww. Don't think that is a good idea. It is documented: `?':='` brings up a help page if pkg:data.table is loaded. – IRTFM Oct 09 '14 at 02:58
  • @BondedDust But `data.table` doesn't *own* that function. They really are relying on something lying around in the parser over which they have no control. Another package could redefine `c` if they like. That's essentially what they are doing (there just happens to be no default implementation for `:=`) – MrFlick Oct 09 '14 at 03:00
  • The OP did mention the data.table package so I would argue that in that setting `:=` is "owned" by data.table. I suppose I would agree with you if you argued that the period-function ( `.()` ) has separate (localized) "ownership" by `bquote` and plyr functions. And to my point, redefining `c()` is also a really bad idea. – IRTFM Oct 09 '14 at 03:02
  • @BondedDust But let's say that `data.table` wanted to define `~=` as a new infix operator. They cannot do that because the parser would not recognize it (i.e., `parse(text="a~=3")` would generate an error). The "specialness" of `:=` is completely independent of the `data.table` package. It is a big exception to the rule that custom infix operators require `%`, which seems to be the spirit of the OP's question as I read it. – MrFlick Oct 09 '14 at 03:07
  • I'd like to amend my comment about redefining 'c'. It's a generic fn IIRC, so you are free to create a new class with a new meaning. – IRTFM Apr 20 '15 at 18:49
  • I'm guessing at some point R Core was thinking of implementing `:=` as an alternative to `<-` since some languages use that as the [assignment operator](http://stackoverflow.com/questions/5344694/what-does-do), but then dropped the idea after the parser was written. – BrodieG Apr 19 '17 at 22:03
  • Following up on @BrodieG, I'm curious if there are any other "vestigial" operators sitting around to be snatched up by package authors... have been toying around a bit with no luck – MichaelChirico Oct 31 '18 at 03:44
  • @MichaelChirico You can basically read the parser code. It doesn't look like there is any other low-hanging fruit for nice operators. I mean, you can look at how `rlang` chose to change the meaning of `!!` and `!!!` despite the fact that those are not parsed as single operators. You could re-purpose something like `(( a ))` to mean something different. – MrFlick Oct 31 '18 at 16:24

It's not just a colon operator but rather := is a single operator formed by the colon and equal sign (just as the combination of "<" and "-" forms the assignment operator in base R). The := operator is an infix function that is defined to be part of the evaluation of the "j" argument inside the [.data.table function. It creates or assigns a value to a column designated by its LHS argument using the result of evaluating its RHS.
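To make the idea of handling `:=` while evaluating the "j" argument concrete, here is a rough base R sketch. It is not data.table's implementation: `set_col` is a hypothetical function written for this illustration, it works on a plain data.frame, and it returns a modified copy rather than assigning in place. It only shows how a function can capture an unevaluated `lhs := rhs` expression and interpret it itself.

```r
# Hypothetical helper (not part of data.table): interpret `col := expr`
# against a data.frame and return the data.frame with that column set.
set_col <- function(df, j) {
  expr <- substitute(j)                      # capture j without evaluating it
  stopifnot(is.call(expr),
            identical(expr[[1]], as.name(":=")))
  col <- as.character(expr[[2]])             # LHS names the target column
  # Evaluate the RHS with the columns of df visible as variables.
  df[[col]] <- eval(expr[[3]], df, parent.frame())
  df
}

df <- data.frame(x = 1:3)
df <- set_col(df, y := x * 2)   # `:=` is never called as a function here
```

Because the expression is captured with `substitute()`, no function named `:=` needs to exist for this to work; the parser accepting the token is enough.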

IRTFM
  • To the downvoters who are not explaining their concerns: this answer was written to respond to the question as originally written. You can see its original title and content by clicking on the "edited ...." link. – IRTFM Oct 09 '14 at 17:39