Consider two data tables where the number of key columns differs:
library(data.table)
tmp_dt <- data.table(group1 = letters[1:5], group2 = c(1, 1, 2, 2, 2), a = rnorm(5), key = c("group1", "group2"))
tmp_dt2 <- data.table(group2 = c(1, 2, 3), color = c("r", "g", "b"), key = "group2")
I want to join tmp_dt to tmp_dt2 by group2; however, the following fails:
tmp_dt[tmp_dt2]
> tmp_dt[tmp_dt2]
Error in bmerge(i, x, leftcols, rightcols, io, xo, roll, rollends, nomatch, :
x.'group1' is a character column being joined to i.'group2' which is type 'double'. Character columns must join to factor or character columns.
This makes sense, since data.table tries to join on the first key column of tmp_dt (group1). How do I fix it so that the behaviour is the same as dplyr::inner_join, without incurring the overhead of resetting the key on tmp_dt twice?
> inner_join(tmp_dt, tmp_dt2, by = "group2")
  group1 group2          a color
1      a      1  0.2501413     r
2      b      1  0.6182433     r
3      c      2 -0.1726235     g
4      d      2 -2.2239003     g
5      e      2 -1.2636144     g
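
One possible approach, sketched here rather than offered as a definitive answer, is data.table's on= argument, which names the join columns for a single call and leaves the existing keys alone. The sketch below rebuilds the two tables from above; the set.seed() call and the name res are additions for illustration only, and nomatch = 0L is what drops the unmatched group2 = 3 row so that the result behaves like an inner join.

```r
library(data.table)

set.seed(1)  # arbitrary seed (an assumption), only to make the sketch reproducible
tmp_dt <- data.table(group1 = letters[1:5], group2 = c(1, 1, 2, 2, 2),
                     a = rnorm(5), key = c("group1", "group2"))
tmp_dt2 <- data.table(group2 = c(1, 2, 3), color = c("r", "g", "b"),
                      key = "group2")

# Ad hoc join on group2 only; neither table's key is reset.
# nomatch = 0L drops rows of tmp_dt2 that have no match in tmp_dt (inner join).
res <- tmp_dt[tmp_dt2, on = "group2", nomatch = 0L]
print(res)

# The key on tmp_dt is unchanged after the join.
print(key(tmp_dt))
```

Because the join columns are passed per call, tmp_dt keeps its two-column key for any later keyed operations.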