Given the structure of my data below, I would like to calculate the forward five years average annual growth rate (where the first year is lead(markup)
of the column markup
for each individual (individual are identified in the column (gvkey)
), and add to the data frame that average as a column. However, some individuals have less than five years observations, and for all individuals', their last 4 years observations, have less than 5 years of observations ahead of them. For those cases, the average annual growth rate should adjust to the number of observation ahead of them (with a maximum of 5).
dput(example)
structure(list(gvkey = c(1001L, 1001L, 1001L, 1003L, 1003L, 1003L,
1003L, 1003L, 1003L, 1003L, 1004L, 1004L, 1004L, 1004L, 1004L,
1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L,
1004L, 1004L, 1004L, 1004L, 1004L, 1004L), fyear = c(1983L, 1984L,
1985L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1980L,
1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L,
1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
1999L), markup = c(3.02456418383518, 2.91714600416106, 2.97620103473762,
0.628645648836935, 0.538264738598443, 0.74536402337831, 0.89905329776662,
0.571759161863088, 0.510497237569061, 0.621391904401246, 0.320146680750145,
0.277978758953348, 0.31442332968701, 0.319433516915814, 0.324865816687745,
0.335264348013352, 0.328048313395744, 0.326632245360565, 0.340874293859881,
0.320374201245953, 0.27456562124358, 0.276693369097675, 0.245072145096866,
0.241026046834387, 0.242841330851661, 0.249635000371186, 0.257903948772679,
0.262641379065405, 0.261534064206543, 0.22953354130982)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -30L), groups = structure(list(
gvkey = c(1001L, 1003L, 1004L), .rows = structure(list(1:3,
4:10, 11:30), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))
This is as far as I got:
example %>%
filter(fyear %in% 1980:2019)%>%
group_by((gvkey))%>%
mutate(markupchange = ((((lead(markup)-markup)/markup)+(lead(markup, K =2)-lead(markup)/lead(markup))+(lead(markup, K =3)-lead(markup, k =2)/lead(markup, K=2))+(lead(markup, K =4)-lead(markup, k =3)/lead(markup, K=3))+(lead(markup, K =5)-lead(markup, k =4)/lead(markup, K=4))/5)))
What I can't figure out how to indicate is to shorten the lenght of the average annual growth rate for thoses cases with less than 5 observations ahead.
As an ouput I would like to get back the same data frame with the extra column for the average annual growth rate of the markup
.
The value in row 1 of the added column should be -0,00628231878798876 and in the second row 0,020547945.
Many thanks for any tips.