I have the following time-series price data:
18/01/2008 7.4811
22/01/2008 7.5267
31/01/2008 7.8289
01/02/2008 7.82
...
30/10/2008 7.81
31/10/2008 7.75
I built a function calVariation to calculate the variation of the price as: variation = log(data/data[1,1]).
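As a small illustration of this formula, here is a minimal sketch applied to the first four sample prices listed above. The name prices is hypothetical and used only for this example; it does not appear in the script further down.

```r
# First four prices from the sample above, stored as a one-column matrix
# (`prices` is a hypothetical name used only for this illustration)
prices <- matrix(c(7.4811, 7.5267, 7.8289, 7.82), ncol = 1)

# Log variation of every row relative to the first row;
# the first entry is log(1) = 0 by construction
variation <- log(prices / prices[1, 1])

print(variation)
```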
calVariation starts from Line 1 of the data, i.e., it calculates the variations for data[1:nrow(data),], then finds the first value in the variation result array that is less than a threshold of -5%.
- If nothing is found, calVariation should run again but start from the next line of the data, i.e., compute the variations for data[2:nrow(data),].
- If it finds that the variation at line n is less than the threshold, it saves the original data from Line 1 to Line n to one column of a matrix mat. The original data is then reduced to data[n:nrow(data),] and becomes the input for calVariation in the next step.
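The steps above can be sketched iteratively as follows. This is only one reading of the description, not the original calVariation: the helper name splitOnDrop, the use of a plain numeric vector instead of a matrix, the returned list of segments, and the example prices are all assumptions made for illustration. "Line 1" is taken to mean the first line of the current window.

```r
# Hypothetical helper (not the original calVariation): splits a price
# vector into segments that each end at the first drop below `threshold`.
splitOnDrop <- function(prices, threshold = -0.05) {
  stopifnot(threshold < 0)              # a negative threshold guarantees progress
  segments <- list()
  start <- 1
  n <- length(prices)
  while (start < n) {
    # Variation of the current window relative to its first line
    vari <- log(prices[start:n] / prices[start])
    hit <- which(vari < threshold)      # which() treats NA as "not found"
    if (length(hit) == 0) {
      start <- start + 1                # nothing found: retry from the next line
    } else {
      end <- start + hit[1] - 1         # line n, expressed as an absolute index
      segments[[length(segments) + 1]] <- prices[start:end]
      start <- end                      # the next window begins at line n
    }
  }
  segments
}

# Made-up prices, chosen so that two drops of more than 5% occur
segs <- splitOnDrop(c(100, 101, 94, 95, 96, 90, 91))
print(segs)   # two segments: 100 101 94 and 95 96 90
```

Returning a list avoids padding every segment with NA to a common length; the segments can still be bound into a matrix afterwards if that layout is needed.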
The following is my code:
pathway <- 'C:/'
decimal <- ","
threshold <- -0.05

database <- as.matrix(read.csv(paste(pathway, "Data_origin.csv", sep = ""), header = FALSE, sep = ";", dec = decimal))
data_p <- as.matrix(database[, 2])
data_p <- as.matrix(as.numeric(data_p))
rownames(data_p) <- database[, 1]

calVariation <- function(mData, threshold) {
  if (nrow(mData) > 1) {
    vari <- log(mData / mData[1, 1])
    if (any(vari < threshold) == FALSE) {  # Not found any value < -5
      mData <- as.matrix(mData[2:nrow(mData), ])
      mData <- calVariation(mData, threshold)
    } else {  # Found value < -5
      threshold_id <- min(which(vari < threshold))
      mData <- as.matrix(mData[1:threshold_id, ])
    }
  } else {
    mData <- NULL
  }
  return(mData)
}

data <- data_p
mat <- NULL
rowid <- 0
while (nrow(data) > 1 && is.null(data) == FALSE) {
  temp <- matrix(NA, nrow(data_p), 2)
  data <- calVariation(data, threshold)
  if (is.null(data) == FALSE) {
    temp[1:nrow(data), 1] <- rownames(data)
    temp[1:nrow(data), 2] <- data
    rowid <- rowid + nrow(data)
    mat <- cbind(mat, temp)
    data <- as.matrix(data_p[rowid:nrow(data_p), ])
  } else {
    break
  }
}
It returns this error:
Error in if (any(vari < threshold) == FALSE) { :
  missing value where TRUE/FALSE needed
I guess that this error happens when vari becomes NA, but I tried to use something like the is.na function to get rid of it and it didn't work out.
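A minimal sketch of how an NA inside vari can lead to this message, using a made-up vector rather than the original data. It only illustrates the mechanism and does not confirm that this is what happens with Data_origin.csv. The names cond, res, found_narm and found_istrue are hypothetical.

```r
# Made-up variations containing one missing value (not from the original data)
vari <- c(0, 0.01, NA, -0.02)
threshold <- -0.05

# No comparison is TRUE and one is unknown, so any() returns NA
cond <- any(vari < threshold)
print(is.na(cond))

# Using that NA as an if() condition raises the error; tryCatch() records it
res <- tryCatch({
  if (cond == FALSE) "no error"
}, error = function(e) "error")
print(res)

# Two guarded variants that always yield a plain TRUE or FALSE
found_narm   <- any(vari < threshold, na.rm = TRUE)  # drop NA before testing
found_istrue <- isTRUE(any(vari < threshold))        # treat NA as FALSE
```

Either guarded form could replace the condition inside calVariation. Checking is.na on the vari matrix as a whole would not work as an if() condition, because is.na returns one value per element instead of a single TRUE or FALSE.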
The original data for the test can be found here. Many thanks in advance.