I need some help vectorizing the following code, because I believe that would make it more efficient, but I do not know where to begin. I created a loop that goes through z. z has 3 columns and 112847 rows, which might be one reason it takes a long time. The 3 columns contain the numbers passed to the MACD() function as nFast, nSlow and nSig.

library(quantmod)
library(TTR)

# get stock data 
getSymbols('LUNA')

#Choose the Adjusted Close of a Symbol
stock <- Ad(LUNA)

#Create matrix for returns only
y <- stock

#Create a "MATRIX" by choosing the Adjusted Close 
Nudata3 <- stock

#Sharpe Ratio Matrix
SR1<- matrix(NA, nrow=1)

# I want to create a table with all possible combinations from the ranges below
i = c(2:50)
k = c(4:50)
j = c(2:50)

# stores possible combinations into z
z <- expand.grid(i,k,j)
colnames(z)<- c("one","two","three")            

n = 1
stretches <- length(z[,1])

while (n < stretches){ 

# I am trying to go through all the values in "z"
Nuw <- MACD((stock), nFast=z[n,1], nSlow=z[n,2], nSig=z[n,3], maType="EMA")

colnames(Nuw) <- c("MACD","Signal")  #change the col names to create signals
x <- na.omit(merge((stock), Nuw))

x$sig <- NA

# Create trading signals                            

sig1 <- Lag(ifelse((x$MACD <= x$Signal),-1, 0)) # short when MACD < SIGNAL 
sig2 <- Lag(ifelse((x$MACD >= x$Signal),1, 0))  # long when MACD > SIGNAL 
x$sig <- sig1 + sig2



#calculate Returns
ret <- na.omit(ROC(Ad(x))*x$sig)
colnames(ret)<- c(paste(z[n,1],z[n,2],z[n,3],sep=","))
x <- merge(ret,x)
y <- merge(y,ret) #This creates a MATRIX with RETURNs ONLY
Nudata3 <- merge(Nudata3, x)

((mean(ret)/sd(ret)) * sqrt(252))  -> ANNUAL # Creates a Ratio
ANNUAL->Shrat                                # stores Ratio into ShRat
SR1 <- cbind(SR1,Shrat)                      # binds all ratios as it loops

n <- (n+1)

}

I would like to know how to vectorize the MACD() function to speed up the process, since the length of stretches is approx. 112847. It takes my computer quite some time to get through the loop.

Jason
  • Please provide some [reproducible code](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Specifically, because I'm not familiar with pulling trading data, I don't know what `Ad()` is, nor what `LUNA` is (are you referring to the stock symbol LUNA?). Also, what is `MACD` supposed to be doing? – r2evans Apr 15 '14 at 04:28
  • What evidence do you have that MACD is your speed problem? You're doing a lot of merges in a while loop. That's got to be slow. You might be better off asking a question about how to accomplish what you actually want to do efficiently and then providing your code as an example of what you've tried. – John Apr 15 '14 at 04:42
  • @r2evans `Ad()` is the adjusted close of the stock data. Yes, LUNA is the trading symbol. I updated the code to load the packages used. As for `MACD()`, its documentation says: "The MACD was developed by Gerald Appel and is probably the most popular price oscillator. The MACD function documented in this page compares a fast moving average (MA) of a series with a slow MA of the same series. It can be used as a generic oscillator for any univariate series, not only price." – Jason Apr 15 '14 at 04:45
  • +1 because all information is provided now. – Christian Apr 15 '14 at 06:32

1 Answer

First and foremost, a case-specific optimization: remove the cases where nFast >= nSlow, as they don't make sense technically.

Secondly, you are creating objects and copying them over and over again on every iteration (the merge and cbind calls inside the loop). This is very expensive.

Thirdly, you can perhaps code this better by creating a matrix of signals in one loop and doing the rest of the operations in a vectorized manner.
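To illustrate the second point, here is a minimal base R sketch of the difference between growing a result with cbind() inside a loop and filling a preallocated vector. The function one_ratio is a hypothetical stand-in for one iteration's Sharpe ratio and is not part of the original code.

```r
# Hypothetical stand-in for the value computed in one iteration
# (an assumption for illustration only).
one_ratio <- function(n) n / 100

n_iter <- 1000

# Growing: every cbind() allocates a new object and copies all earlier values.
grown <- NULL
for (n in seq_len(n_iter)) {
  grown <- cbind(grown, one_ratio(n))
}

# Preallocating: the result vector is created once and filled in place.
prealloc <- numeric(n_iter)
for (n in seq_len(n_iter)) {
  prealloc[n] <- one_ratio(n)
}

# vapply() produces the same values without an explicit loop.
applied <- vapply(seq_len(n_iter), one_ratio, numeric(1))
```

All three approaches give the same numbers; only the amount of copying differs, and that difference grows with the number of iterations.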

I would code what you are doing something like this.

Please read the help pages of mapply, do.call, merge and sapply if anything below is unclear.

require(quantmod)
getSymbols("LUNA")

#Choose the Adjusted Close of a Symbol
stock <- Ad(LUNA)

# I want to create a table with all possible combinations from the ranges below
i = c(2:50)
k = c(4:50)
j = c(2:50)

# stores possible combinations into z
z <- expand.grid(i,k,j)

IMO, this is where your first optimization should be: remove the cases where i >= k.

z <- z[z[,1]<z[,2], ]

This reduces the number of cases from 112847 to 57575.
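The counts can be checked without downloading any price data. This sketch only rebuilds the parameter grid; the column names follow the ones used in the question, and z_valid is a name introduced here for illustration.

```r
# Full grid: 49 values of i, 47 values of k, 49 values of j
z <- expand.grid(one = 2:50, two = 4:50, three = 2:50)
nrow(z)        # 49 * 47 * 49 = 112847

# Keep only combinations where the fast period is shorter than the slow one
z_valid <- z[z$one < z$two, ]
nrow(z_valid)  # 57575
```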

#Calculate only once. No need to calculate this in every iteration.
stockret <- ROC(stock)

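# Strategy returns for one (nFast, nSlow, nSig) combination
getStratRet <- function(nFast, nSlow, nSig, stock, stockret) {
    x <- MACD(stock, nFast = nFast, nSlow = nSlow, nSig = nSig, maType = "EMA")
    x <- na.omit(x)
    # -1 (short) when macd < signal, +1 (long) when macd > signal, 0 when equal;
    # lagged by one period so each position uses the previous period's signal
    sig <- Lag(ifelse((x$macd <= x$signal), -1, 0)) + Lag(ifelse((x$macd >= x$signal), 1, 0))
    return(na.omit(stockret * sig))
}

# SIMPLIFY = FALSE makes mapply always return a list of xts objects for merge
RETURNSLIST <- do.call(merge, mapply(FUN = getStratRet, nFast = z[,1], nSlow = z[,2], nSig = z[,3], MoreArgs = list(stock = stock, stockret = stockret), SIMPLIFY = FALSE))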

getAnnualSharpe <- function(ret) {
    ret <- na.omit(ret)
    return ((mean(ret)/sd(ret)) * sqrt(252))
}


SHARPELIST <- sapply(RETURNSLIST, FUN = getAnnualSharpe)
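As a side note on the signal logic in getStratRet: the sum of the two ifelse() calls is simply the sign of macd minus signal. The sketch below checks this on made-up numbers (the vectors are illustrative, not real MACD output) and uses a plain base R shift in place of Lag().

```r
# Made-up values for illustration only
macd   <- c(-1.2, 0.0, 0.7, 0.3, -0.4)
signal <- c(-0.5, 0.0, 0.2, 0.9, -0.4)

# The form used in getStratRet (before lagging)
two_step <- ifelse(macd <= signal, -1, 0) + ifelse(macd >= signal, 1, 0)

# Equivalent single call: -1 below, +1 above, 0 when equal
one_step <- sign(macd - signal)

# Shift by one period so each position uses the previous period's signal
position <- c(NA, head(one_step, -1))
```

Whether replacing the ifelse() pair with sign() makes a measurable difference here is not something the timings in this answer cover; the main benefit is readability.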

The results look like this. Column n of RETURNSLIST corresponds to row n of the filtered z, so working out which column belongs to which combination of i, k, j is straightforward.

head(RETURNSLIST[, 1:3])
##            LUNA.Adjusted LUNA.Adjusted.1 LUNA.Adjusted.2
## 2007-01-10   0.012739026    -0.012739026               0
## 2007-01-11  -0.051959739     0.051959739               0
## 2007-01-12  -0.007968170    -0.007968170               0
## 2007-01-16  -0.007905180    -0.007905180               0
## 2007-01-17  -0.005235614    -0.005235614               0
## 2007-01-18   0.028315920    -0.028315920               0

SHARPELIST
##   LUNA.Adjusted LUNA.Adjusted.1 LUNA.Adjusted.2 LUNA.Adjusted.3 LUNA.Adjusted.4 LUNA.Adjusted.5 LUNA.Adjusted.6 
##      0.04939150     -0.07428392             NaN      0.02626382     -0.06789803     -0.22584987     -0.07305477 
## LUNA.Adjusted.7 LUNA.Adjusted.8 LUNA.Adjusted.9 
##     -0.05831643     -0.08864845     -0.08221986 
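Because mapply walks through the rows of z in order, labels for the columns can be built directly from the filtered grid, using the same paste() pattern as in the question. This is a sketch: combo_labels is a name introduced here, and the two commented-out assignments assume that RETURNSLIST and SHARPELIST were computed over all rows of the filtered z.

```r
# Rebuild the filtered parameter grid
z <- expand.grid(one = 2:50, two = 4:50, three = 2:50)
z <- z[z$one < z$two, ]

# One label per combination, in the same order as the rows of z
combo_labels <- paste(z$one, z$two, z$three, sep = ",")
head(combo_labels, 3)
# "2,4,2" "3,4,2" "2,5,2"

# Assumption: one result column per row of z, in row order
# colnames(RETURNSLIST) <- combo_labels
# names(SHARPELIST)     <- combo_labels
```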



# Timing for the first 100 combinations; the braces make both assignments
# part of the single expression that system.time() measures
system.time({
    RETURNSLIST <- do.call(merge, mapply(FUN = getStratRet, nFast = z[1:100,1], nSlow = z[1:100,2], nSig = z[1:100,3], MoreArgs = list(stock = stock, stockret = stockret), SIMPLIFY = FALSE))
    SHARPELIST <- sapply(RETURNSLIST, FUN = getAnnualSharpe)
})
   user  system elapsed 
   2.28    0.00    2.29 
CHP
  • Are you referring to `x$MACD <= x$Signal`? How can I follow through without copying over and over? – Jason Apr 15 '14 at 05:08
  • No. For the MACD indicator, `nFast > nSlow` doesn't make sense. – CHP Apr 15 '14 at 05:10
  • I see what you mean. Thanks for the code above, but I have one question. How long does it take for your computer to run the code? I now believe that it might be my computer that is slow, as it takes a long time to run the code. – Jason Apr 15 '14 at 06:17
  • You can't speed things up beyond a certain point. I updated my answer above with the timing for 100 iterations: about 2.29 seconds. – CHP Apr 15 '14 at 06:21
  • OK, it takes my computer approximately the same time for 100 iterations... I was worried because it was taking a long time for the full length. Thanks once again, I appreciate it! – Jason Apr 15 '14 at 06:29
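For a rough sense of the full run time, the reported timing can be extrapolated. This is only a back-of-the-envelope sketch: it assumes the cost per combination stays constant, which may not hold because the final merge handles far more columns in the full run.

```r
secs_per_100 <- 2.29    # elapsed time reported above for 100 combinations
n_combos     <- 57575   # combinations left after removing i >= k

# Linear extrapolation (an assumption, not a measurement)
est_secs <- n_combos / 100 * secs_per_100
est_secs / 60           # roughly 22 minutes
```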