I have the following piece of code:
library(dplyr)
Q = 10000
span = 1995:2016
# one row per (id, year): each year repeated Q times, ids cycling within a year
time = rep(span, each = Q)
id = rep(1:Q, times = length(span))
s1 = rep(rnorm(Q,0,1), times = length(span))
gdp = rep(rnorm(Q,0,1), times = length(span))
e = rep(rnorm(Q,0,1), times = length(span))
dfA = data.frame(id,time,s1,e,gdp)
mgr = double()
stp = 10
for(K in seq(10,Q,stp)){
  gr = double()
  for(t in span){
    wt1 = dfA %>% filter(time == t-1) %>%
      arrange(desc(s1)) %>% mutate(w = s1/gdp)
    zt1 = dfA %>% filter(time == t-1) %>% mutate(z1 = log(s1/e))
    zt = dfA %>% filter(time == t) %>% mutate(z = log(s1/e))
    # join on id, the unit identifier in dfA
    gt = left_join(zt1,zt,by="id") %>%
      mutate(g = z-z1) %>% select(id,g) %>% na.omit()
    a = left_join(wt1,gt,by="id") %>% na.omit()
    # rank = position in the s1 ordering, kept separate from id
    a = a %>% mutate(rank = row_number()) %>%
      filter(rank <= Q) %>% mutate(gbar = mean(g)) %>%
      filter(rank <= K) %>% mutate(sck = g-gbar,
                                   gamma = w*sck)
    gr = append(gr, sum(a$gamma))
  }
  mgr = append(mgr,mean(gr))
}
where dfA is a data frame containing an id variable and a time variable, among others. Since the time variable ranges from 1995 to 2016 and K is a sequence with step 10, I resorted to append() to store gr and mgr, respectively. The problem is that it takes too long to compute.
So my question is: Is there any way to avoid using append() to fill the vectors gr and mgr and thus reduce the time spent to compute the code?
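
To make the question concrete, the pattern I am hoping to reach is roughly the sketch below: gr and mgr are created at their final length with numeric() and filled by index instead of being grown with append(). Note that compute_gamma_sum() is a hypothetical placeholder standing in for the dplyr pipeline inside the inner loop, and Ks is a shortened version of seq(10,Q,stp); neither is part of my real code.

```r
# Hypothetical stand-in for the per-year dplyr pipeline; returns one number.
compute_gamma_sum <- function(t, K) {
  (t - 1994) * K / 1000
}

span <- 1995:2016
Ks   <- seq(10, 100, by = 10)   # shortened; the real code uses seq(10, Q, stp)

# Preallocate the result vectors at their final length, then fill by index.
mgr <- numeric(length(Ks))
for (i in seq_along(Ks)) {
  gr <- numeric(length(span))
  for (j in seq_along(span)) {
    gr[j] <- compute_gamma_sum(span[j], Ks[i])
  }
  mgr[i] <- mean(gr)
}

# Same computation without explicit loops, using vapply.
mgr2 <- vapply(Ks, function(K) {
  mean(vapply(span, compute_gamma_sum, numeric(1), K = K))
}, numeric(1))
```

I suspect append() is only part of the cost, though: wt1, zt1, zt and gt do not depend on K, yet they are recomputed for every value of K in the outer loop.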