This question is a follow-up from a previous one:
left outer join with data.table with different names for key variables
I have now tried to use the "on" argument in the manner suggested by Matt Dowle. I am puzzled that it does not work. I wonder if this currently works only with version 1.9.7 of data.table.
packageVersion('data.table')
# [1] ‘1.9.6’
DT1 = data.table(x1 = c("b", "c", "a", "b", "a", "b"),
x2a = as.character(1:6), m1 = seq(10, 60, by = 10))
DT1
# x1 x2a m1
# 1: b 1 10
# 2: c 2 20
# 3: a 3 30
# 4: b 4 40
# 5: a 5 50
# 6: b 6 60
DT2 = data.table(x1 = c("b", "d", "c", "b" ,"a", "a"),
x2b = c(1, 4, 7, 6, " ", " ") ,m2 = 5:10)
DT2
# x1 x2b m2
# 1: b 1 5
# 2: d 4 6
# 3: c 7 7
# 4: b 6 8
# 5: a 9
# 6: a 10
#### merge command works fine
rtL <- merge(DT1, DT2, by.x = c('x1', 'x2a'),
by.y = c('x1', 'x2b'), all.x = TRUE)
rtL
# x1 x2a m1 m2
# 1: a 3 30 NA
# 2: a 5 50 NA
# 3: b 1 10 5
# 4: b 4 40 NA
# 5: b 6 60 8
# 6: c 2 20 NA
#### Join with the X[Y] syntax with the 'on' argument
rtL2 <- DT2[DT1, on = c('x1', x2a = 'x2b')]
Error in
forderv(x, by = rightcols)
:by
value -2147483648 out of range [1,3]
What is wrong here? Does this require version 1.9.7?
##### Another attempt with the x1 variable in quotes
rtL3 <- DT2[DT1, on = c("x1", "x2a" = "x2b")]
Error in
forderv(x, by = rightcols)
:by
value -2147483648 out of range [1,3]
It seems to me that the rtL2
version is more correct.
How do I interpret the error message? And what is it that I am doing wrong here?