3

Suppose I have a data table A containing a set of entries and an index column that assigns a unique number to each row. I also have a data table B that contains entries of A, like so:

library(data.table)
set.seed(1)
A <- do.call(CJ, list(seq(3), seq(2), seq(2)))
A[,index := seq(nrow(A))]
B <- data.table(sample(3,3,replace=TRUE), sample(2,3,replace=TRUE),
                sample(2,3,replace=TRUE))

I want to define an index column for B that assigns each row to the corresponding index in A. What is the most efficient way to do this with data.table?

Thanks.

user3294195
  • 1,748
  • 1
  • 19
  • 36
  • 1
    You can do `A[B, on = names(B)]` – akrun Jan 18 '17 at 16:25
  • 2
    @akrun In my opinion, that's not an appropriate dupe. OP presumably wants to add a column to `B` by reference like `B[A, on=names(B), index := i.index ]`, not materialize an entirely new table. That's one of the big reasons folks use data.table. – Frank Jan 18 '17 at 16:30
  • 1
    @Frank I reopened it. I thought it is a simple merge – akrun Jan 18 '17 at 16:32
  • Frank's answer is exactly what I was looking for. I would accept it as an answer if it wasn't a comment. – user3294195 Jan 18 '17 at 16:35

2 Answers2

3

To add a column from A to B based on their matching rows:

B[A, on=names(B), index := i.index ]

The main docs are at ?data.table

Graham
  • 7,431
  • 18
  • 59
  • 84
Frank
  • 66,179
  • 8
  • 96
  • 180
1

I think you need a join:

A[B, on = c("V1", "V2", "V3")]

#   V1 V2 V3 index
#1:  1  2  2     4
#2:  2  1  2     6
#3:  2  2  2     8
Psidom
  • 209,562
  • 33
  • 339
  • 356