I'm happy to see that data.table has a new release, and I have one question about `J()`. From the data.table NEWS for 1.9.2:
`x[J(2), a]`, where `a` is the key column sees `a` in `j`, #2693 and FAQ 2.8. Also, `x[J(2)]` automatically names the columns from `i` using the key columns of `x`. In cases where the key columns of `x` and `i` are identical, `i`'s columns can be referred to by using `i.name`; e.g., `x[J(2), i.a]`
There are several questions about `J()` on S.O., and the introduction to data.table also talks about binary search with `J()`, but my understanding of `J()` is still not very clear.
All I know is that, if I want to select the rows where column A is "b" and column B is "d", I can write:
DT2 <- data.table(A = letters[1:5], B = letters[3:7], C = 1:5)
setkey(DT2, A, B)
DT2[J("b", "d")]
And if I want to select the rows where A is "a" or "c", I write:
DT2[A == "a" | A == "c"]
which is much like the data.frame way. (Minor question: how can I do this selection in a more data.table way?)
So, to my understanding, `J()` is only used in the case above: selecting a single value from each of two different key columns. I hope my understanding is wrong.
There is little documentation about `J()`. I read "How is J() function implemented in data.table?", which says that `J(.)` is detected and simply replaced with `list(.)`. It seems that `list(.)` can replace `J(.)` in every case.
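A minimal sketch of how that equivalence could be checked, assuming the `DT2` table from above. The names `r_J` and `r_list` are only for illustration, and this covers just the single two-column lookup, not every case:

```r
library(data.table)

# Same keyed table as in the example above
DT2 <- data.table(A = letters[1:5], B = letters[3:7], C = 1:5)
setkey(DT2, A, B)

# Keyed lookup written with J() and with list() in i
r_J    <- DT2[J("b", "d")]
r_list <- DT2[list("b", "d")]

print(r_J)
print(r_list)

# Compare the two results
print(all.equal(r_J, r_list))
```

If the two forms really are interchangeable here, both calls should give the same one-row result.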
And back to the question: what is the purpose of this new feature, `x[J(2), a]`?
I would really appreciate a detailed explanation!