I'm backtesting trading strategies with R. This is my code at the moment:

- MergedSet$FXCloseRate contains the closing price for a certain currency pair
- MergedSet$RiskMA is the moving average of a certain risk index
- the rest should be clear

At the moment this loop is not very fast over 11,000 entries. Why? Are data frames too slow? Where can I optimise here?

############
# STRATEGY #
############
#Null out trades and position
MergedSet$Trade <- 0
MergedSet$Position<-0
MergedSet$DailyReturn<-0
MergedSet$CumulativeReturn<-0
MergedSet$Investment<-0
MergedSet$CumulativeReturn[1:MAPeriod] <- 1
MergedSet$Investment[1:MAPeriod] <- InitialInvestment


#Strategy
n<-nrow(MergedSet)
for(i in seq(MAPeriod+1,n)){
  #Updating the position
  if(MergedSet$RiskMA[i] <= ParamDwn && MergedSet$RiskMA[i-1] > ParamDwn){
    #sell signal, so short if no or long position active otherwise do nothing
    if(MergedSet$Position[i-1] == 0 || MergedSet$Position[i-1] == 1){
      MergedSet$Position[i] = -1
      MergedSet$Trade[i] = 1
    }
  } else if(MergedSet$RiskMA[i] >= ParamUp && MergedSet$RiskMA[i-1] < ParamUp){
    #buy signal, go long if no or short position active, otherwise do nothing
    if(MergedSet$Position[i-1] == 0 || MergedSet$Position[i-1] == -1){
      MergedSet$Position[i] = 1
      MergedSet$Trade[i] = 1
    }
  } else {
    MergedSet$Position[i] = MergedSet$Position[i-1]
  }

  #Return calculation
  if(MergedSet$Position[i] == 1){
    #long
    MergedSet$DailyReturn[i] = MergedSet$FXCloseRate[i]/MergedSet$FXCloseRate[i-1]-1
  } else if(MergedSet$Position[i] == -1){
    #short
    MergedSet$DailyReturn[i] = MergedSet$FXCloseRate[i-1]/MergedSet$FXCloseRate[i]-1
  }
}
MichiZH
  • You're not pre-allocating. Do that, and gaze. – Roman Luštrik Jun 19 '13 at 12:41
  • @RomanLuštrik: looks pre-allocated to me. More likely the issue is that subsetting data.frames is slow. – Joshua Ulrich Jun 19 '13 at 12:43
  • `[<-.data.frame` and `[.data.frame` are relatively slow. If you can use a matrix, i.e., all data is numeric, you should do so. Or work with individual vectors and `cbind` them or put them in your data.frame after the loop. – Roland Jun 19 '13 at 12:45
  • http://stackoverflow.com/a/8474941/636656 See in particular the advice for `data.table` (which will make your subsets very fast) and avoiding loops. – Ari B. Friedman Jun 19 '13 at 12:49
  • Thx, I thought about matrices. But how could I convert time back and forth? And how do I "null" all parts of the matrix in the beginning? Do you have some sample code for this problem? I wasn't able to find anything exactly for that. – MichiZH Jun 19 '13 at 12:52
  • Instead of using a data frame, just work with individual vectors in the loop, and then create the data frame at the end (i.e. delete `MergedSet$` everywhere). That will be _much_ faster with minimal code changes. – hadley Jun 19 '13 at 13:30
  • But I don't know how to get the vectors from the data frame (and make sure all vectors have the same length), since I import the data into a data frame with `read.table`... – MichiZH Jun 19 '13 at 13:32
  • `x = as.numeric(MergedSet$Position)` is an example of getting a vector from a data.frame. – zkurtz Jun 19 '13 at 13:44
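
The suggestion in the comments above (pull the columns out as plain vectors, loop over those, and rebuild the data frame afterwards) could look roughly like the sketch below. It is only an illustration: the toy `MergedSet`, the parameter values, and the helper name `run_strategy` are assumptions made up for the example, and the strategy logic is copied unchanged from the question. The `Date` column never enters the loop, so no conversion of times is needed.

```r
# Toy stand-in for MergedSet; all values are invented for illustration.
set.seed(1)
n        <- 200
MAPeriod <- 5
ParamUp  <- 0.5
ParamDwn <- -0.5
MergedSet <- data.frame(
  Date        = as.Date("2013-01-01") + seq_len(n) - 1,
  FXCloseRate = 1.2 + cumsum(rnorm(n, sd = 0.005)),
  RiskMA      = sin(seq_len(n) / 7)
)

# Hypothetical helper: same logic as the question, but on plain numeric vectors.
run_strategy <- function(fx, risk, ma_period, param_up, param_dwn) {
  n <- length(fx)
  # numeric(n) pre-allocates a zero-filled vector ("nulling out")
  position     <- numeric(n)
  trade        <- numeric(n)
  daily_return <- numeric(n)

  for (i in seq(ma_period + 1, n)) {
    if (risk[i] <= param_dwn && risk[i - 1] > param_dwn) {
      # sell signal: go short if flat or long
      if (position[i - 1] == 0 || position[i - 1] == 1) {
        position[i] <- -1
        trade[i]    <- 1
      }
    } else if (risk[i] >= param_up && risk[i - 1] < param_up) {
      # buy signal: go long if flat or short
      if (position[i - 1] == 0 || position[i - 1] == -1) {
        position[i] <- 1
        trade[i]    <- 1
      }
    } else {
      # no signal: carry the previous position forward
      position[i] <- position[i - 1]
    }

    if (position[i] == 1) {
      daily_return[i] <- fx[i] / fx[i - 1] - 1   # long
    } else if (position[i] == -1) {
      daily_return[i] <- fx[i - 1] / fx[i] - 1   # short
    }
  }

  data.frame(Position = position, Trade = trade, DailyReturn = daily_return)
}

# Extract the columns once, run the loop, then attach the results.
result <- run_strategy(MergedSet$FXCloseRate, MergedSet$RiskMA,
                       MAPeriod, ParamUp, ParamDwn)
MergedSet <- cbind(MergedSet, result)
```

Because every column of a data frame has the same length, vectors extracted with `$` are automatically aligned. Note that, exactly as in the question's code, a signal in the direction already held leaves `position[i]` at its initial value of 0 rather than carrying the position forward; whether that is intended is for the author to decide.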

0 Answers