I have the following problem and I really don't know how to solve it.
I am trying two join two data.tables - the first one containing more than 8000 rows and the second one with more than 10000 rows. For reasons of simplicity take the first DT to be like:
DT1 <- data.table(ticker=LETTERS[rep(seq(1,4), each=3)],
date=seq(as.Date("2015-03-08"), by= "day", length.out = 3),
PX_LAST = c(10,12,14,8,4,6,7,5,7,6,3,5), ticker2=rep(c("GLEN LN", "JPEE IA", "BEE GR", "ISA GG"), each=3))
and the second DT as:
DT2 <- data.table(ticker=rep(c("GLEN LN", "JPEE IA", "BEE GR", "ISA GG"), each=5),
date=seq(as.Date("2015-03-07"), by= "day", length.out = 5),
Z_SPRD=c(66,76,68,70,69,112,117,125,122,114,22,25,20,21,24,323,270,290,333,299),
BID=c(6.2,7.2,8.5,7.9,6.9,11.2,11.7,12.5,12.2,11.4,2.5,2.5,2.4,2.1,2.4,32.1,27.4,29.3,33.8,29.2))
What I want know is to join on the keys "date"
and "ticker2 = ticker"
.
The final output should look like:
DT3 <- data.table(ticker=LETTERS[rep(seq(1,4), each=3)],
date=seq(as.Date("2015-03-08"), by= "day", length.out = 3),
PX_LAST = c(10,12,14,8,4,6,7,5,7,6,3,5), ticker2=rep(c("GLEN LN", "JPEE IA", "BEE GR", "ISA GG"), each=3),
Z_SPRD=c(76,68,70,117,125,122,25,20,21,270,290,333),
BID=c(7.2,8.5,7.9,11.7,12.5,12.2,2.5,2.4,2.1,27.4,29.3,33.8))
ticker date PX_LAST ticker2 Z_SPRD BID
1: A 2015-03-08 10 GLEN LN 76 7.2
2: A 2015-03-09 12 GLEN LN 68 8.5
3: A 2015-03-10 14 GLEN LN 70 7.9
4: B 2015-03-08 8 JPEE IA 117 11.7
5: B 2015-03-09 4 JPEE IA 125 12.5
6: B 2015-03-10 6 JPEE IA 122 12.2
7: C 2015-03-08 7 BEE GR 25 2.5
8: C 2015-03-09 5 BEE GR 20 2.4
9: C 2015-03-10 7 BEE GR 21 2.1
10: D 2015-03-08 6 ISA GG 270 27.4
11: D 2015-03-09 3 ISA GG 290 29.3
12: D 2015-03-10 5 ISA GG 333 33.8
Any help is much appreciated. Perhaps the solution is quite simple but not for me.
Thanks!