I would like to transform a data frame that has both start-year and end-year variables into a complete time series that (1) includes all the years in between start-year and end-year and (2) fills in the values of all the variables for the years in between.
This is what the original data looks like:
data_original <- data.frame(
  name       = c("peter", "peter", "eric", "denisse"),
  lastname   = c("smith", "smith", "jordan", "williams"),
  age        = c(54, 54, 48, 40),
  start_year = c(1980, 1986, 1990, 2000),
  end_year   = c(1984, 1988, 1993, 2001)
)
data_original
#>      name lastname age start_year end_year
#> 1   peter    smith  54       1980     1984
#> 2   peter    smith  54       1986     1988
#> 3    eric   jordan  48       1990     1993
#> 4 denisse williams  40       2000     2001
This is what I would like the data to look like:
data_final <- data.frame(
  name     = rep(c("peter", "eric", "denisse"), times = c(8, 4, 2)),
  lastname = rep(c("smith", "jordan", "williams"), times = c(8, 4, 2)),
  age      = rep(c(54, 48, 40), times = c(8, 4, 2)),
  year     = c(1980, 1981, 1982, 1983, 1984, 1986, 1987, 1988,
               1990, 1991, 1992, 1993, 2000, 2001)
)
data_final
#>       name lastname age year
#> 1    peter    smith  54 1980
#> 2    peter    smith  54 1981
#> 3    peter    smith  54 1982
#> 4    peter    smith  54 1983
#> 5    peter    smith  54 1984
#> 6    peter    smith  54 1986
#> 7    peter    smith  54 1987
#> 8    peter    smith  54 1988
#> 9     eric   jordan  48 1990
#> 10    eric   jordan  48 1991
#> 11    eric   jordan  48 1992
#> 12    eric   jordan  48 1993
#> 13 denisse williams  40 2000
#> 14 denisse williams  40 2001
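For reference, here is a minimal base-R sketch of the kind of expansion described above. It is only one possible approach, and it assumes every row has start_year <= end_year and that the years are whole numbers. The names n_years, row_index, keep_cols and data_expanded are illustrative and not part of my actual data.

```r
# Example input, same values as data_original above
data_original <- data.frame(
  name       = c("peter", "peter", "eric", "denisse"),
  lastname   = c("smith", "smith", "jordan", "williams"),
  age        = c(54, 54, 48, 40),
  start_year = c(1980, 1986, 1990, 2000),
  end_year   = c(1984, 1988, 1993, 2001)
)

# Number of years covered by each row (assumes start_year <= end_year)
n_years <- data_original$end_year - data_original$start_year + 1

# Repeat each row index once per year it covers
row_index <- rep(seq_len(nrow(data_original)), times = n_years)

# Keep every column except the two range columns
keep_cols <- setdiff(names(data_original), c("start_year", "end_year"))
data_expanded <- data_original[row_index, keep_cols]

# Build the year sequence for each original row and flatten it into one vector
data_expanded$year <- unlist(Map(seq, data_original$start_year, data_original$end_year))

# Reset the row names that were duplicated by the row repetition
rownames(data_expanded) <- NULL

data_expanded
```

Because rows are repeated rather than rebuilt column by column, any additional variables in the data frame would be carried along without being named explicitly.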
Many thanks in advance for this and for your continued help!