I have a large time-series data set with 8 key data variables. I am basically looking for the same data for the year before and creating a cumulative sum.
I have a code that works but it is incredibly slow, is there a more efficient way to
My current code works it is slow:
EDIT: My code wasn't reproducible. I am sorry for that. After more effort than I am proud of it should be now.
Source data
Data <- data.frame(
V1 = rep(1:50, 25*3*20),
V2 = rep(1:25, each= 50, 3*20),
V3 = rep(1:3, each=50*25, 20),
Year = rep(1:20, each=50*25*3),
YearTotal = rnorm(75000),
CumulativeSum = 0)
Working but very slow code
for (i in 1:75000){
Data$CumulativeSum[i] <- Data$YearTotal[i] +
ifelse(Data$Year[i] == 1, 0,
Data$CumulativeSum[
Data$V1 == Data$V1[i] &
Data$V2 == Data$V2[i] &
Data$V3 == Data$V3[i] &
Data$Year + 1 == Data$Year[i] ])
}
On my actual code which is quite a bit longer, and has character and period variables, takes me over 30 minutes to calculate. I couldn't find any similar problems on StackOverflow - maybe I didn't know what to search. I can't help but believe there must be a more elegant and fast way to get it done.