I have two datatable
that I would like to Join by the two key columns:
df1=data.table(key1=c(1.2,1.1,1.9,4.6,2.3,3.6,0.9,2.8),key2=c("A","B","A","B","A","B","A","B"))
df2=data.table(key1=c(1,1,2,3),key2=c("A","B","B","B"),value=c("A","B","D","C"))
and here is the desired output:
key1 key2 value
1: 0.9 A A
2: 1.1 A A
3: 1.2 B B
4: 1.9 A NA
5: 2.3 A NA
6: 2.8 B C
7: 3.6 B NA
8: 4.6 B NA
Matching numeric keys with nearest value could be done with the following:
# Join with closest value of Numeric Key, works fine
df1=data.table(key1=c(1.2,1.1,1.9,4.6,2.3,3.6,0.9,2,8),key2=c("B","B","A","A"))
df2=data.table(key1=c(1,1,2,3),key2=c("A","B","B","B"),value=c("A","B","D","C"))
setkey(df1,key1)
setkey(df2,key1)
df2[df1,roll="nearest"]
Also if numeric keys would have excact matches the problem could be solved again:
df1=data.table(key1=c(1,1,2,5,2,4,1,3),key2=c("A","B","A","B","A","B","A","B"))
df2=data.table(key1=c(1,1,2,3),key2=c("A","B","B","B"),value=c("A","B","D","C"))
setkey(df1,key1,key2)
setkey(df2,key1,key2)
df2[df1]
Solution does not have to be with data.table
.
Thanks