
I have an event dataset covering government projects in different sectors. It looks like this:

project_id <- c(1,2,3,4)
district_id <- c(5,6,7,8)
start_year <- c(2000, 2011, 2006, 2004)
end_year   <- c(2002, 2020, 2010, 2015)
sector     <- c("education", "infrastructure", "education", "infrastructure")

 project_id district_id start_year end_year         sector
1          1           5       2000     2002      education
2          2           6       2011     2020 infrastructure
3          3           7       2006     2010      education
4          4           8       2004     2015 infrastructure

Is there a way to convert this to a panel dataset with district-year as the unit of analysis? The values from the other columns need to carry over to the new rows, so that, for example, project 3 ends up as:

 project_id district_id year    sector
1          3           7 2006 education
2          3           7 2007 education
3          3           7 2008 education
4          3           7 2009 education
5          3           7 2010 education

I am looking for an answer that transfers easily to bigger datasets.

KC15
  • 2
    Hi LucasLeonhard, Here are two good solutions, I hope they solve your problem! https://stackoverflow.com/a/59958215 and https://stackoverflow.com/a/55982621 . I was a bit torn when this question came up in the Reopen queue. On one hand, the duplicate question ignores the need to expand multiple data columns. On the other hand, two of its answers do correctly handle multiple data columns. In the end, I edited/commented on the other question to highlight the useful answers, and kept this one closed. – Esteis Jan 06 '23 at 20:43
  • 1
    Thank you, I appreciate your help and explanation :) – KC15 Jan 08 '23 at 07:33
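
For readers who land here, the expansion asked for in the question could be sketched in base R roughly as follows. This is only one possible approach and not necessarily what the linked answers do. The data frame name `events` and the result name `panel` are made up for illustration, and the sketch assumes that project 3 ends in 2010 (as the desired output suggests) and that every `end_year` is greater than or equal to its `start_year`.

```r
# Example data from the question (end_year of project 3 assumed to be 2010)
events <- data.frame(
  project_id  = c(1, 2, 3, 4),
  district_id = c(5, 6, 7, 8),
  start_year  = c(2000, 2011, 2006, 2004),
  end_year    = c(2002, 2020, 2010, 2015),
  sector      = c("education", "infrastructure", "education", "infrastructure")
)

# Number of years each project covers, counting both endpoints
n_years <- events$end_year - events$start_year + 1

# Repeat each row once per covered year, keeping the columns to carry over
panel <- events[rep(seq_len(nrow(events)), times = n_years),
                c("project_id", "district_id", "sector")]

# One year sequence per project, concatenated in the same row order
panel$year <- unlist(Map(seq, events$start_year, events$end_year))

# Reorder columns to match the layout in the question and reset row names
panel <- panel[, c("project_id", "district_id", "year", "sector")]
rownames(panel) <- NULL

# Rows for project 3 only
panel[panel$project_id == 3, ]
```

Because the row repetition and the year sequences are both built from the same columns in the same order, this should scale to larger datasets without a loop over rows. Any additional columns can be carried over by adding them to the column selection.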

0 Answers