Much appreciated for your comment/answer.
Context: I have a large data table of daily prices of swap rates across a dozen countries. The columns are [ID, Date, X1Y, X2Y, X3Y ... X30Y], where X..Y are columns indicating the part of the yield curve (e.g. X1Y is 1-year swap, X3Y is 3-year swap). The two keys are ID (e.g. "AUD", "GBP") and Date (e.g. "2001-04-13", "2001-04-16").
Dummy Data:
set.seed(123)
dt <- cbind(ID=rep(c("AUD","GBP"),c(100,100)),X1Y=rnorm(200),X2Y=rnorm(200),X3Y=rnorm(200))
dt <- data.table(dt)
dt[,Date := seq(from=as.IDate("2013-01-01"), by="1 day", length.out=100)]
setkeyv(dt,c("ID","Date"))
Problem 1: First generate some dummy signals. What's the syntax if there is 100 columns with fairly complicated signal generation formula coded in a separate function say genSig(X1Y)? Here's what I mean using just the 3 columns and some meaningless formula:
dt[,SIG1 :=c(0, diff(X1Y ,1)),by="ID"]
dt[,SIG2 :=c(0, diff(X2Y ,1)),by="ID"]
dt[,SIG3 :=c(0, diff(X3Y ,1)),by="ID"]
Problem 2: Carry forward column(s) based on "middle of the month". For example, using the SIG columns, I'd like to make everything after say the 15th of each month the same as the signal on the 15th, until next month's 15th. The tricky thing is that the actual data contains just trading days so some months do not have 15th if it is a weekend/holiday. Another issue is using an efficient syntax, I could achieve something similar using loop (I know..) for the start of each month just to show what I meant:
for (i in 2:length(dt$Date)){
if(as.POSIXlt(dt[i,]$Date)$mon == as.POSIXlt(dt[i-1,]$Date)$mon){
dt[i, SIG1 := dt[i-1,SIG1]]
dt[i, SIG2 := dt[i-1,SIG2]]
dt[i, SIG3 := dt[i-1,SIG3]]
}
}
I can't figure out how to deal with the "mid-month" issue since it can fall on the 15th or 16th or 17th. Like Problem 1, would appreciate if there is a smart way to insert/update multiple/dozen columns.