I have a data frame called "marketdata" which contains 3,000,000 rows (rownames: 1 to 3,000,000) and 2 columns (colnames: "mid", "bo").
> head(marketdata)
  mid   bo
1 250 0.05
2 251 0.07
3 252 0.13
4 249 0.08
5 250 0.12
My function is as follows:
movingWindow <- function(submarketdata) {
  # keep only the "effective sample": rows with bo <= 0.1
  temp <- submarketdata[submarketdata$bo <= 0.1, ]
  # mean "mid" of the effective sample, and its fraction of the 100-row window
  c(mean(temp$mid), NROW(temp) / 100)
}
result <- lapply(101:NROW(marketdata), function(i) movingWindow(marketdata[(i - 99):i, ]))
For example, for row 101 the window is marketdata[2:101, ]. Within that window I find the rows whose "bo" value is <= 0.1 (the "effective sample"), then compute the mean of their "mid" values and the percentage of the window that they represent.
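For reference, here is a small self-contained version of what I am running (the `set.seed` call and the 500-row toy data are mine, just to make the example reproducible; the real marketdata has 3,000,000 rows):

```r
set.seed(1)
# toy stand-in for the real 3,000,000-row marketdata
marketdata <- data.frame(mid = sample(240:260, 500, replace = TRUE),
                         bo  = round(runif(500, 0, 0.2), 2))

movingWindow <- function(submarketdata) {
  # keep only the "effective sample": rows with bo <= 0.1
  temp <- submarketdata[submarketdata$bo <= 0.1, ]
  # mean "mid" of the effective sample, and its fraction of the 100-row window
  c(mean(temp$mid), NROW(temp) / 100)
}

# one 100-row window ending at each row from 101 onward
result <- lapply(101:NROW(marketdata),
                 function(i) movingWindow(marketdata[(i - 99):i, ]))

length(result)  # 400 windows (rows 101..500)
```

Each element of `result` is a length-2 vector: the mean "mid" of the effective sample and the effective-sample percentage for that window.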
However, this script runs really slowly: it took about 15 minutes to process all 3,000,000 rows. Could anyone help me speed this up? Thank you.