Let's have one irregular time series table, e.g. buy/sell transactions and resulting positions on stocks (here Apple):
library(data.table)
txns = data.table(Instrument=rep("AAPL", 3),
Date=as.Date(c("2014-05-10","2014-05-15", "2014-05-20")),
Txn.Qty=c(100, -20, -30), #
key=c("Instrument", "Date"))
txns[, Pos.Qty:=cumsum(Txn.Qty), by="Instrument"]
txns
# Instrument Date Txn.Qty Pos.Qty
# 1: AAPL 2014-05-10 100 100
# 2: AAPL 2014-05-15 -20 80
# 3: AAPL 2014-05-20 -30 50
Then we want to track positions (and possibly daily valuation) in a second, mostly regular time-series table:
positions = data.table(Instrument=rep("AAPL", 13),
Date=as.Date(setdiff(as.Date("2014-05-07") + 1:15,
txns$Date[1:2])), # gaps are possible
key=c("Instrument", "Date"))
out = merge(positions, txns, all=T)
out[!txns, Txn.Qty:=0]
out[, Pos.Qty:=zoo::na.locf(Pos.Qty, na.rm=F), by=Instrument] # `zoo` dependency
out
# Instrument Date Txn.Qty Pos.Qty
# 1: AAPL 2014-05-08 0 NA
# 2: AAPL 2014-05-09 0 NA
# 3: AAPL 2014-05-10 100 100
# 4: AAPL 2014-05-11 0 100
# 5: AAPL 2014-05-12 0 100
# 6: AAPL 2014-05-13 0 100
# 7: AAPL 2014-05-14 0 100
# 8: AAPL 2014-05-15 -20 80
# 9: AAPL 2014-05-16 0 80
# 10: AAPL 2014-05-17 0 80
# 11: AAPL 2014-05-18 0 80
# 12: AAPL 2014-05-19 0 80
# 13: AAPL 2014-05-20 -30 50
# 14: AAPL 2014-05-21 0 50
# 15: AAPL 2014-05-22 0 50
I was hoping to speed up the above (i.e. roll last observation forward in some columns, fill non-joins with zeros for other columns) with a single, efficient join and roll
argument, performing selectively only on some of the columns (similar to .SDcols
philosophy).
Is it possible to achieve something like this using data.table
's roll
argument on selected columns only?
Also, additional argument fill
in [.data.table
would help, equivalent to na.fill
(we can think about roll
corresponding to na.locf
)